The financial crisis marked a paradigm shift, from traditional studies of individual risk to recent
research on the 'systemic risk' generated by whole networks of institutions. However, the reverse
effects of realized defaults on network topology are poorly understood. Here we analyze the Dutch
interbank network over the period 1998-2008, ending with the global crisis. We find that many
topological properties, after controlling for overall density effects, display an abrupt change in 2008,
thus providing a clear but unpredictable signature of the crisis. By contrast, if the intrinsic
heterogeneity of banks is controlled for, the same properties undergo a slow and continuous transition,
gradually connecting the crisis period to a much earlier stationary phase. This early-warning signal
begins in 2005, and is preceded by an even earlier period of 'risk autocatalysis' characterized by
anomalous debt loops. These remarkable precursors are undetectable if the network is reconstructed
from partial bank-specific information.

I. INTRODUCTION.

Financial and banking systems are strongly interconnected networks of institutions exposed to both endogenous and
exogenous fluctuations [H [2] . When defaults occur, they cascade throughout the network and can cause the collapse
of an entire system, as dramatically witnessed by the recent financial crisis [3]. As a consequence, the analysis of
economic and financial networks as the channel through which critical events can propagate has received a lot of
attention [4Vll5j. Much effort has been devoted to the search for regularities in the structure of financial networks, such
as a strong degree of heterogeneity, a core-periphery or a modular structure [SHH]. Similarly, null models have been
introduced in order to understand whether part of the observed topological complexity can be explained relatively
simply in terms of the observed heterogeneity of vertices fTU'-TJ' . For interbank networks specifically, a lot of attention
has been devoted to quantifying the level of systemic risk (the risk of the collapse of the system as a whole) determined
by a particular network topology, as opposed to the traditional measures of risk defined for individual banks p!5HT5] .
It turns out that the minimization of (standard measures of) individual risk can often increase the level of systemic
risk, which in turn can hurt individual financial entities PQI2]. This highlights the inadequacy of traditional models
and regulation and suggests an analogy with ecological networks [HI [20] .

All the above approaches focus on the structural properties, or on the level of systemic risk, associated with a given, 
static network topology. However, interbank networks are highly dynamic. Besides their 'ordinary' rearrangements due 
to the normal activity of banks, they are also likely to sometimes show major structural changes as banks are forced to 
adapt to unpredicted stress events, like defaults of other banks and/or financial crises. As defaults start to propagate, 
previous financial connections might disappear and new ones might appear, modifying the way further reverberations 
of a crisis are channeled through the network. Therefore, in order to adequately understand the complex interbank 
phenomenology, one needs to take into account both the (expected) effects of network topology on the stability of the 
financial system as well as the reverse effects of (realized) defaults on the structure of interbank networks. This has 
led to models of interbank networks that dynamically adapt to critical events, with a continuous feedback between 
topology and dynamics [21]. In this paper, we follow this line of reasoning but, rather than introducing a theoretical
model, we carry out an empirical characterization of the interplay between realized financial stress and the changes 
in the observed interbank structure. The adaptive nature of interbank markets deepens the analogy with ecological 
networks, which also feature this evolutionary property. 

We address two main questions: does the topology of an interbank network undergo major structural change as a 
crisis suddenly manifests itself? And if so, are there any topological precursors of this structural change, to be used as 
early-warning signals of the approaching crisis? Our results indicate that the answer to both questions is affirmative. 
We also find, surprisingly, that whether we find early-warning signals depends not on the topological property
being monitored but on the null model being used as a reference.

II. DATA. 

In order to carry out our analysis, we focus on the recent global financial crisis, which became manifest at the end of
2007 and continued throughout 2008 [3], and on its latent build-up phase, which is much more difficult to identify. We
selected a dataset reporting 44 quarterly snapshots of the Dutch Interbank Network (DIN in the following), starting
from the first quarter of 1998 and ending with the last quarter of 2008 [6]. Each snapshot reports the exposures between
Dutch banks during the corresponding quarter, and represents them as binary links directed from the borrower to the 
lender. Our data includes the year when the crisis manifested itself in its strongest form (2008) plus the preceding 10 
years, which would most likely include the build-up phase. More details about the data are given in the Appendix. 
Note that, when studying the propagation of defaults, the binary topology of interbank networks (i.e. whether a 
link from one bank to another exists or not) plays the primary role. The magnitude of the connections, while
certainly important, plays only a quantitative role. For instance, the existence and uniqueness of a 'clearing payment
vector' that clears the obligations of all banks after a default only depends on the interbank topology [18]. Moreover,
while a weighted network is of course more informative than its binary projection, recent empirical results [12-14]
have shown that the knowledge of a binary property (e.g. the binary degrees) often conveys more information about a
real-world economic network than the knowledge of the corresponding weighted property (e.g. the weighted degrees).
For these reasons, we are interested in studying purely topological patterns such as the abundance of non-isomorphic 
subgraphs or the core-periphery structure. Also note that, since weighted network properties are highly sensitive to 
quantitative fluctuations in the intensity of connections even when the topology remains fixed, and since in stressed 
conditions such fluctuations are likely to be large, studying only binary network properties allows us to factor out 
purely quantitative fluctuations, and to focus on genuinely topological changes and major structural transitions. Our
expectation that a topological analysis conveys the most important piece of information about the evolution of the
DIN will be confirmed at the end of our analysis, when interpreting our results in the light of the (under)estimation
of systemic risk.

FIG. 1. Left: observed number of vertices (black) and links (gray). Right: observed density of the whole network (black), of
the core (red) and of the periphery (brown). The y-scale is logarithmically spaced in both cases.



III. TOPOLOGICAL SIGNATURES OF THE CRISIS. 



We start by looking for possible topological signatures of the crisis. We find that, at the onset of the crisis, the size
(number of vertices N) and connectedness (number of links L) of the network do not show any significant change in
their (roughly stationary) trends (see fig. [1], left). A similar result holds for the link density (or connectance c), which
is the ratio of realized to possible links (see fig. [1], right). We can also separately consider the density of the core
and of the periphery of the network, defined as two non-overlapping sets of banks that provide the best approximation
to the so-called Core-Periphery (CP) model [6], a specific modular property of real interbank networks. The ideal
CP model assumes that core banks are all bilaterally linked with each other, that periphery banks do not lend to
each other, and that core banks both lend to and borrow from at least one periphery bank (discussed further in [B]
and Appendix). From the right panel of fig. [1] we confirm that the core is much denser than the periphery (as by
construction it should be). However, the core- and periphery-specific densities, exactly as the overall density, show
only a slight jump from the end of 2007 to the beginning of 2008. The size of the change is not significant, as it is of
the same order as the fluctuations characterizing the entire 11-year time interval. Taken together, the above results
show that the size and density of the network are completely uninformative about the crisis, especially if we are
looking for early-warning signals.
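
As a minimal illustration of the quantities N, L and c discussed above, the following sketch computes them from one binary snapshot. The function name and the toy matrix are ours and purely illustrative; we assume the snapshot is stored as a 0/1 adjacency matrix with A[i, j] = 1 for a link from bank i to bank j and an empty diagonal.

```python
import numpy as np

def size_links_density(A):
    """Return (N, L, c) for a binary directed adjacency matrix without self-loops."""
    A = np.asarray(A, dtype=int)
    N = A.shape[0]                # number of vertices (banks)
    L = int(A.sum())              # number of directed links
    c = L / (N * (N - 1))         # realized links over possible directed links
    return N, L, c

# Toy snapshot (hypothetical data, not taken from the DIN).
A_toy = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0]])
N, L, c = size_links_density(A_toy)
```

Applying the function to each quarterly snapshot would give the three time series that the text describes as uninformative about the crisis.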

However, the picture changes if, after controlling for the size and density themselves, we consider higher-order
(dyadic, triadic, and so on) topological properties. We first focus on the abundances of the three possible dyadic
motifs in the observed network, i.e. the number $L^{\leftrightarrow}$ of reciprocated (full) dyads, the number $L^{\rightarrow}$ of non-reciprocated
(single) dyads, and the number $L^{\nleftrightarrow}$ of empty dyads (see fig. [2]). These numbers are informative only after filtering
out mere size and density effects, or more complicated topological properties. Therefore, here and in what follows,
we compare each measured quantity X with the expected value $\langle X \rangle$ under a null model which has some properties
in common with the observed network, and is otherwise maximally random. Technically, the method we adopt is an
analytical and unbiased one [22] based on maximum-entropy ensembles of graphs with constraints [36] (see Appendix
for details). In order to detect the statistically significant deviations from the expected value $\langle X \rangle$, we also consider
the standard deviation $\sigma[X]$ (calculated under the null model) and define the z-score

$$ z_X = \frac{X - \langle X \rangle}{\sigma[X]} $$

The z-score is a standardized variable measuring the difference between the observed and the expected number
of dyads in units of standard deviation. If X is normally distributed under the null model, then values within
z = ±1, z = ±2, z = ±3 would (approximately) occur with a 68%, 95%, 99.7% probability respectively. If the observed
value of X corresponds to a large positive (negative) value of $z_X$ then the quantity X is over(under)-represented in
the data, and not explained by the null model. We will make extensive use of z-scores. For most of the topological
properties we consider (i.e. dyads and core-periphery structure), the normality under the null model is either trivially
ensured by the Central Limit Theorem (CLT), or checked numerically (see Appendix). In our case the CLT cannot
be invoked for triads due to statistical dependencies among the random variables involved (triads necessarily share
dyads, and are therefore not independent of each other). Nonetheless, larger z-scores are still believed to identify
more significant patterns, and have been widely used in the past ^01.

FIG. 2. Temporal evolution of the dyadic z-scores: $z_{L^{\leftrightarrow}}$ under the DRG (top-left, purple circles) and the DCM (top-right, blue
circles), $z_{L^{\rightarrow}}$ under the DRG (middle-left, purple, full squares) and the DCM (middle-right, blue, full squares), $z_{L^{\nleftrightarrow}}$ under the
DRG (bottom-left, purple, empty squares) and the DCM (bottom-right, blue, empty squares).

In the left panels of fig. [2] we show the temporal evolution of the z-scores for each of the three dyadic motifs, 
under a null model that controls for the size and density of the network: the Directed Random Graph (DRG), the
directed version of the Erdős-Rényi random graph model (see Appendix). We find that, while the
size and density of the network are relatively stable throughout the entire period, all the dyadic z-scores undergo
an abrupt jump in 2008. The crisis period (highlighted in ochre in fig. [2]) is characterized by a sudden decrease of
the abundance of full and empty dyads, and a sudden increase of the abundance of single dyads. Note that, before
the crisis, the number of reciprocated dyads is significantly larger than the expected one, while during the crisis it
becomes marginally consistent with the null model (i.e., the random graph). This trend is of course mirrored in the
behavior of the reciprocity, as we have confirmed in detail (see Appendix). Similarly, the observed abundances of 
single and empty dyads become consistent with the null model in 2008. Since the total number of links is more or 
less stable, the net effect we see is that reciprocal connections suddenly 'decouple' and fill previously empty dyads, 
making single dyads increase and empty dyads decrease. So the network seems to suddenly evolve from a fluctuating 
but roughly stationary configuration (with few single dyads and many full and empty ones) to a 'crisis' configuration 
whose dyadic structure is marginally consistent with that of an unstructured random graph. We denote this sudden 
loss of structure as the 'collapse' of the original network. The dyadic motifs, when using the DRG as a reference, 
are therefore clear topological signatures of the crisis. They are however unpredictable, since they show an abrupt 
transition with no evidence of a previous build-up phase. They allow us to 'see' the crisis, but not to 'foresee' it. 
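
The dyadic z-scores under the DRG can be sketched in a few lines. This is an illustration under stated assumptions rather than the analytical machinery of [22]: we assume a 0/1 adjacency matrix with an empty diagonal, take the DRG connection probability to be p = L / (N(N-1)), and use the fact that under the DRG the N(N-1)/2 vertex pairs are independent, so each dyad count is binomially distributed. The function names are hypothetical.

```python
import numpy as np

def dyad_counts(A):
    """Observed numbers of (full, single, empty) dyads in a binary directed graph."""
    A = np.asarray(A, dtype=int)
    N = A.shape[0]
    n_pairs = N * (N - 1) // 2
    full = int((A * A.T).sum()) // 2      # pairs linked in both directions
    single = int(A.sum()) - 2 * full      # pairs linked in one direction only
    empty = n_pairs - full - single       # pairs with no link at all
    return full, single, empty

def drg_dyad_zscores(A):
    """z-scores of (full, single, empty) dyads under the Directed Random Graph.

    Undefined (division by zero) for an empty or a complete graph.
    """
    A = np.asarray(A, dtype=int)
    N = A.shape[0]
    n_pairs = N * (N - 1) // 2
    p = A.sum() / (N * (N - 1))           # DRG probability matching the observed density
    # Probability that a given pair is full, single or empty.
    probs = (p ** 2, 2 * p * (1 - p), (1 - p) ** 2)
    z = []
    for observed, q in zip(dyad_counts(A), probs):
        mean = n_pairs * q                      # binomial mean over independent pairs
        std = np.sqrt(n_pairs * q * (1 - q))    # binomial standard deviation
        z.append((observed - mean) / std)
    return tuple(z)
```

Evaluating `drg_dyad_zscores` on each of the 44 snapshots would yield time series of the kind plotted in the left panels of fig. [2].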



IV. EARLY-WARNING TOPOLOGICAL PRECURSORS: THE PRE-CRISIS PHASE. 

We now show that, surprisingly, the picture changes entirely as we consider a more stringent null model where the
intrinsic heterogeneity of banks is accurately controlled for. In particular, we compare each snapshot of the network
with a null model (known as the Directed Configuration Model (DCM)) where the number of in- and out-going links of
each bank (i.e. the in- and out-degree, respectively) is kept equal to the observed values, and the network is otherwise
random (see Appendix). Note that the degree reflects a variety of properties of a bank, particularly its size (larger
banks are involved in more transactions and have more partners, as also captured by the CP model). For a discussion
of the properties of the degree distribution(s) of this network, see [5].

Note that the DRG is an economically unrealistic benchmark, as the degrees of all banks are narrowly distributed
around their empirical average value. This corresponds to an unrealistically homogeneous interbank market (in terms
of size and virtually any other property of banks). As a consequence, when studying the deviations of the real
network from the DRG, we cannot disentangle the effects of unrealistic bank-specific properties from those of genuine
higher-order (dyadic and beyond) patterns. Incidentally, this shows the main limitation of the representative agent
concept when applied to economic networks [27]. By contrast, the DCM indirectly preserves the real heterogeneity of
banks, by preserving the observed degrees produced by that heterogeneity. This provides a realistic benchmark, with
deviations indicating genuine signatures of higher-order effects beyond the bank-specific level, directly arising from
the choices of banks.
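
To make the DCM benchmark concrete, the sketch below fits a maximum-entropy ensemble in which the expected in- and out-degree of every bank equals the observed one. It is an illustration under our own assumptions, not the procedure of [22]: we assume link probabilities of the form p_ij = x_i y_j / (1 + x_i y_j) and solve for the hidden variables x, y with a plain fixed-point iteration, whose convergence is not guaranteed for every degree sequence. All function names are hypothetical.

```python
import numpy as np

def fit_dcm(A, n_iter=1000, tol=1e-10):
    """Return the matrix P of link probabilities p_ij = x_i*y_j / (1 + x_i*y_j)
    whose expected out- and in-degrees match those of the binary matrix A."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    k_out = A.sum(axis=1)
    k_in = A.sum(axis=0)
    x = np.ones(N)
    y = np.ones(N)
    off_diag = 1.0 - np.eye(N)            # excludes self-loops from all sums
    for _ in range(n_iter):
        denom = 1.0 + np.outer(x, y)
        # sum over j != i of y_j / (1 + x_i y_j), one value per row i
        sx = ((y[None, :] / denom) * off_diag).sum(axis=1)
        # sum over i != j of x_i / (1 + x_i y_j), one value per column j
        sy = ((x[:, None] / denom) * off_diag).sum(axis=0)
        x_new = np.divide(k_out, sx, out=np.zeros(N), where=sx > 0)
        y_new = np.divide(k_in, sy, out=np.zeros(N), where=sy > 0)
        converged = max(np.abs(x_new - x).max(), np.abs(y_new - y).max()) < tol
        x, y = x_new, y_new
        if converged:
            break
    xy = np.outer(x, y)
    return xy / (1.0 + xy) * off_diag

def expected_full_dyads(P):
    """Mean and standard deviation of the number of reciprocated dyads, using
    the independence of different vertex pairs in this ensemble."""
    q = np.triu(P * P.T, k=1)             # probability that pair (i, j) is reciprocated
    return q.sum(), np.sqrt((q * (1.0 - q)).sum())
```

Given `P`, the z-score of the observed number of reciprocated dyads follows from the mean and standard deviation returned by `expected_full_dyads`; the single and empty dyads can be treated analogously.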

The second column of fig. [2] shows the dyadic z-scores under the DCM. When comparing these values with the 
previous ones obtained under the DRG, we find surprising results. Firstly, during the first 7 years (quarters 1-28, 
i.e. from 1998 to 2004 included) all z-scores are stationary and have the same sign as under the DRG, but are much 
closer to zero. Their small absolute value (|z| < 2.5) actually suggests that during this period the dyadic structure
of the network is not far from the DCM's prediction, i.e. it is roughly explained by the heterogeneous degrees of banks.
By contrast, starting from the 29th quarter (the first of 2005), all z-scores suddenly change sign and start to move 
away from their previous stationary values. This gradually leads to the collapsed network configuration of 2008. The 
network is then the most distant from the DCM (and, as we observed before, the closest to the DRG). However, the 
'collapse' is not a sudden structural change, as it is clearly preceded by a 3-year 'pre-crisis' period (from 2005 to 2007 
included, highlighted in purple in fig. [2]) transitioning the earlier stationary phase to the 2008 crisis. It is remarkable 
that the trends of all the three dyadic patterns agree on the temporal location of the pre-crisis phase. Our results 
shown so far suggest that, with respect to a homogeneous benchmark (i.e., the representative agent scenario), the 
interbank network displays an abrupt structural transition at the onset of the crisis. On the other hand, with respect 
to a heterogeneous benchmark carefully controlling for the different connectivities of real banks, the transition is slow 
and continuous, and highlights a gradual build-up phase starting three years in advance of the crisis. The pre-crisis 
phase is thus an early-warning signal of the upcoming topological collapse. 

Topological patterns captured by dyadic motifs are limited to correlations within pairs of vertices. In order to
study a higher level of organization, we now analyse the triadic motifs [20, 24-26], i.e. the possible patterns involving
three connected vertices. Triadic motifs can reveal important functional aspects of real complex networks [24, 25],
including ecological ones [20]. Exactly as for the dyads, we consider the z-scores for the abundances of each of the
13 triadic motifs (see Appendix for definitions). However, before considering the temporal evolution of individual
z-scores, we first identify the most significant motifs by comparing all 13 z-scores with each other in sub-periods.
This results in the 'motif profiles' [24, 26] shown in fig. [3], where we used the DCM as the null model. It turns out
that the 44 quarterly snapshots do not collapse to a single profile (see Appendix). By contrast, we find that four
subperiods with different characteristic profiles can be clearly distinguished, as evident from the four panels of fig.
[3]. Within each of the four subperiods, the profiles are coherent and indicate that the triadic structure is to a great
extent stationary. Remarkably, we find that the last two subperiods coincide exactly with the pre-crisis (2005-2007)
and crisis (2008) periods we identified before. The first two periods run from the beginning of 1998 to the first half
of 2000, and from the second half of 2000 to the end of 2004. These findings anticipate two results we will show in
more detail below: first, triadic motifs confirm the existence of a distinct build-up phase preceding the crisis; second,
they further split the 1998-2004 period into two subperiods that were indistinguishable in our dyadic analysis. This
is possible because, since the same dyad can belong to different triads, triadic motifs can disentangle patterns
that are mixed together at a dyadic level.

FIG. 3. Triadic z-scores for all the 44 quarters, grouped into four subperiods, under the DCM. First subperiod: t1-t10 (top-left);
second subperiod: t11-t28 (top-right); third subperiod: t29-t40 (bottom-left); fourth subperiod: t41-t44 (bottom-right).

We investigate the first point here, and address the second one in the next section. As is clear from the bottom
right panel of fig. [3], we can identify the motifs number 2, 5, 10 and 12 as the most significant (|z| > 4.5) triadic
signatures of the 2008 crisis. If we now track these motifs over time (see left panels of fig. [4]), we find exactly the
same behavior as shown above for the dyads: the trends over the entire 1998-2004 period are stationary (with small
z-scores indicating an approximate accordance with the DCM), and from 2005 onwards they gradually evolve towards
the collapsed configuration (for motif 10 the departure actually starts before 2005, but this anomaly will be corrected
by a more constrained null model, as we show below). Therefore we confirm that, as for dyads, the triadic z-scores
under the DCM reveal early-warning signals of the crisis, while this is in general not true (see Appendix) if the DRG is
used as a benchmark. This confirms that the build-up of the crisis is undetectable under homogeneous assumptions,
while it becomes manifest in the gradual divergence of the real interbank market from the configuration expected on
the basis of the observed heterogeneity of banks.

Clearly, since triads are combinations of dyads, some triadic motifs might be over(under)-represented just because 
the dyadic motifs they contain are over(under)-represented, in which case the triad as a whole should not be considered 
an interesting pattern per se (see Appendix for a detailed discussion of the interplay between the dyadic and triadic 
structure). In order to control for this, we introduce a more stringent null model that separately controls for the 
number of single and reciprocated links of each vertex (Reciprocal Configuration Model (RCM), see Appendix for 
details). The RCM separately preserves the number of empty, full, and single (out- and inward) dyads in which each 
vertex is involved. As a result, all the observed dyadic abundances are preserved and the dyadic z-scores are zero by 
construction. In the right panels of fig. [4] we show the triadic z-scores for the same motifs considered previously, but
now recalculated under the RCM. We find that motif 2 shows the same trend as before and that motif 10 now falls in line
with it, revealing the beginning of the pre-crisis phase in 2005. This indicates that motifs 2 and 10 are important
building blocks of the network. Motifs 5 and 12 are no longer significant (|z| < 3.5), and their fluctuating trends do
not show any appreciable change during the pre-crisis and crisis periods.

FIG. 4. Temporal evolution of the z-score for motifs 2, 5, 10 and 12 under the DCM (left) and the RCM (right).

Combining the results so far, and after checking also the evolution of the nine triadic motifs not shown in fig. [4], we
find that all the three dyadic motifs, plus the triadic motifs that are still significant after filtering out dyadic effects,
show clear early-warning signals of the topological collapse, and all agree on the existence of precisely the same 3-year
long pre-crisis phase. The only exception is motif number 9, whose intriguing behavior is shown in the next section.

FIG. 5. Triadic z-scores for all the 44 quarters, grouped into four subperiods, under the RCM. First subperiod: t1-t10 (top-left);
second subperiod: t11-t28 (top-right); third subperiod: t29-t40 (bottom-left); fourth subperiod: t41-t44 (bottom-right).

V. THE EARLIEST PRECURSOR: ANOMALOUS CIRCULAR LENDING.



What remains to be explained is the nature of the separation (occurring in mid-2000) between the first two
subperiods shown previously in fig. [3], as all the trends considered so far do not display any significant change at that
particular point in time. Clearly, we might - and should - also wonder what fig. [3] would look like under the RCM,
and investigate the origin of possible temporal subdivisions in that case as well. We find similar results in both cases.
As before, the motif profiles calculated under the RCM over the entire 1998-2008 period do not collapse to a universal
distribution (see Appendix). Still, inside each of the four subperiods we identified earlier, the profiles are coherent
(see fig. [5]). This confirms, besides the identification of a long pre-crisis phase, that the middle of 2000 temporally
separates two different structural regimes. Remarkably, the first regime is now almost completely consistent with the
null model (|z| < 4 for all 13 motifs), which means that the heterogeneous local connectivity and reciprocity of banks
entirely explain the triadic structure. Moreover, if we now look more closely at figs. [3] and [5], we find that the main
differences between the first two subperiods are determined by motifs 9 (under both the DCM and RCM) and 10
(under the DCM, but not the RCM).

Thus the only significant change occurring in the middle of 2000, after controlling for the dyadic structure, is due
to motif 9. The temporal evolution of the latter is reported in fig. [6] under both null models. We find that, from
the third quarter of 2000 to the last quarter of 2004, motif 9 indeed shows a marked difference with respect to the
rest of the period, and turns out to be strongly over-represented, highlighting an anomalously high number of triads
of banks involved in circular lending loops with no reciprocation. Since this subperiod is only characterized by the
over-representation of motif 9 (all other motifs are still approximately consistent with the RCM), we denote it as
the 'cyclic anomaly' phase (highlighted in figs. [5] and [6]), and regard it as the earliest precursor of the 2008 crisis.

FIG. 6. Temporal evolution of the z-score for motif number 9, under the DCM (left) and under the RCM (right).



Remarkably, when the cyclic anomaly phase ends and the pre-crisis phase begins, motif 9 suddenly changes from being 
the strongest over-represented to being the strongest under-represented motif under the RCM (and not significant 
under the DCM). Thus, it appears that non-reciprocated lending loops, which were the arrangement preferred by triads
of banks before 2005, suddenly became the 'most avoided' triad. The following two periods (pre-crisis and crisis) are
indeed mainly characterized by an increasingly strong under-representation of motifs 9 and 10 (see figs. [4], [5] and [6]),
which both involve a circular lending loop.
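
The observed abundance of circular lending loops with no reciprocation (the pattern described above for motif 9) can be counted directly from a snapshot. The sketch below is illustrative only: it assumes a 0/1 adjacency matrix with an empty diagonal, and the function name is ours. It removes all reciprocated links and then counts directed 3-cycles through the trace of the cubed matrix; since every pair of vertices then carries at most one link, each such cycle is a triad made of exactly three non-reciprocated links.

```python
import numpy as np

def count_unreciprocated_cycles(A):
    """Number of triads i -> j -> k -> i in which none of the three links is reciprocated."""
    A = np.asarray(A, dtype=int)
    S = A * (1 - A.T)                      # keep only links without a reverse link
    # trace(S^3) counts closed walks of length 3; each cycle is counted once
    # per starting vertex, hence the division by 3.
    return int(np.trace(S @ S @ S)) // 3
```

Turning this count into a z-score additionally requires its expected value and standard deviation under the chosen null model (DCM or RCM), which are not covered by this sketch.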

VI. HIGHER-ORDER AND CORE-PERIPHERY STRUCTURE. 

We have identified a hierarchy of topological patterns, from local to higher-order ones, which are increasingly 
informative about the crisis and the preceding phases: overall size and density effects do not show any significant 
temporal change throughout the entire period; if size and density are controlled for (DRG), dyads highlight the crisis
but not its building-up; if the local (bank-specific) connectivity is controlled for (DCM), both dyads and triads reveal 
the 'pre-crisis phase'; if the local reciprocity (thus the complete dyadic structure) is also controlled for (RCM), triads 
further reveal the 'cyclic anomaly' phase. 

At this point, we should ask three related questions. How many higher-order levels in the hierarchy should we
further explore? Can we condense the resulting information into some of the low-level patterns that we have already
analyzed? And what is the most appropriate choice of quantities to control for? While exhaustive answers would
require additional extensive analyses, partial conclusions can be drawn quite easily as follows. First, we wonder
whether the core-periphery structure of the network (which is an example of global property, in principle irreducible
to smaller building blocks of two, three, four vertices and so on), can be traced back to the simple local constraints we
considered so far. To this end, given the best partition of the real network into core and periphery, we checked how
closely this partition is reproduced by the three null models considered. We did this by comparing the measured and
expected values of the error score e (distance from the ideal CP model) and of the density contrast $\Delta c$ (the density
difference between core and periphery). Our results show that both e and $\Delta c$ are never consistent with the DRG,
that e is always consistent with both the DCM and the RCM, and that $\Delta c$ deviates from the DCM prediction during
the pre-crisis and crisis phase, but is always consistent with the RCM (see fig. [7] and more in Appendix). Note that
the negative and decreasing values of $z_{\Delta c}$ during the pre-crisis and crisis phase indicate that the network actually
develops an increasingly 'anti-core' structure where the excess core density is even smaller than the one expected in a
random network with the same banks' connectivities. This means that, at best, the observed core-periphery structure
is of the same nature as the triadic motifs number 5 and 12: it highlights a pre-crisis (anti-core) phase by deviating
from what is expected by controlling for the numbers of lenders and borrowers of each bank (DCM), but it is no longer
significant if the number of reciprocal partners is also controlled for (RCM). This result suggests that the observed
core-periphery structure, and presumably also other high-level properties of the network, can be simply explained in
terms of the building blocks we have already identified, and should not be considered a fundamental feature of the
interbank market.
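
For a given partition into core and periphery, the two quantities compared above can be sketched as follows. This is a simplified illustration and not necessarily the definition used in [6] or in the Appendix: the error score here only counts missing core-core links and existing periphery-periphery links, normalized by the total number of links, and ignores the requirement that each core bank lends to and borrows from at least one periphery bank. The function name and its arguments are hypothetical, and both groups are assumed to contain at least two banks.

```python
import numpy as np

def core_periphery_scores(A, core):
    """Return (error, density_contrast) for a binary directed matrix A and a
    collection `core` of vertex indices; all other vertices form the periphery."""
    A = np.asarray(A, dtype=int)
    N = A.shape[0]
    is_core = np.zeros(N, dtype=bool)
    is_core[list(core)] = True
    off_diag = ~np.eye(N, dtype=bool)
    cc = np.outer(is_core, is_core) & off_diag      # core-core block, no diagonal
    pp = np.outer(~is_core, ~is_core) & off_diag    # periphery-periphery block
    missing_cc = int((1 - A)[cc].sum())             # absent links inside the core
    present_pp = int(A[pp].sum())                   # links inside the periphery
    error = (missing_cc + present_pp) / A.sum()     # simplified error score
    density_contrast = A[cc].mean() - A[pp].mean()  # core density minus periphery density
    return error, density_contrast
```

A z-score for the density contrast would then compare this observed value with its expectation and standard deviation under the DRG, DCM or RCM.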

FIG. 7. Evolution of the z-score for the excess core density under the DCM (blue) and the RCM (green).

Second, we checked which of the three null models should be regarded as the most appropriate one to use. To this
end, we used Akaike's Information Criterion (AIC) in order to find the model with the best combination of accuracy
and parsimony. The DRG is always and significantly outperformed by both the DCM and RCM, which are sometimes
in competition with each other. To better select between the DCM and the RCM, we compared their Akaike weights.
The result is that the DCM is, for most snapshots, the best model, although there are two periods of time (one during
the cyclic anomaly phase and one in the year preceding the crisis) when the best model is the RCM (see Appendix).
This suggests that, most of the time, the relevant information to control for is already parsimoniously encoded into
the number of lending and borrowing partners of each bank. Therefore we do not expect that controlling for even
more complicated constraints than the ones enforced by the RCM (i.e. controlling for triadic patterns themselves) will
result in a competitive null model. The degree sequence is presumably the appropriate hierarchical level to control
for.

Third, we quantified the amount of information (Shannon's entropy) that the knowledge of the constraints defining
the three null models conveys about the whole interbank network, and studied whether this information changes
significantly over time. We found (see Appendix) that, as expected, the RCM always conveys more information than
the DCM, which in turn always conveys more information than the DRG. However, the information gained in going
from the DCM to the RCM is very small, which explains why most of the time the Akaike weights select the DCM
as the more balanced model. Furthermore, we find that the temporal evolution of the DRG entropy, just like that of
the link density, does not show any particular indication of the crisis (see Appendix). Similarly, the DCM and RCM
entropies undergo only a minor jump from 2007 to 2008, thus failing to provide any early-warning signal. These results
confirm that the precursors of the crisis are not visible from the information encoded in the degree sequences, and
must be looked for in higher-order topological properties.

Combining all the results we found so far, we conclude that the most informative patterns throughout the evolution
of the network are all the three dyadic motifs (which, for most of the time, are not trivially explained by the best
null model), plus the triadic motifs not explained by the dyads themselves (i.e. the triads number 9, 10 and, with
less significance, 2). As the crisis approaches, the network seems to progressively lose the structure it had, in terms
of both its reciprocity and the distinction between a core and a periphery. Interestingly, the reciprocity plays a role
also in leveling out the core-periphery difference: reciprocal links in the core decouple to create single dyads in the
periphery (see fig. [1]), leveling out the density contrast. Moreover, during this decoupling process, nodes seem to
carefully avoid creating directed triangle loops, as the evolution of motifs 9 and 10 witnesses (see fig. [6]).



VII. POLICY IMPLICATIONS AND THE SHORTCOMINGS OF BANK-SPECIFIC INFORMATION.

The above results have potentially strong implications for bank regulation policies. An immediate one is that the
popular view that real interbank markets consist of a well defined core-periphery structure, and consequently that
banks can be binarily classified either as big/central or as small/peripheral, might be too simple. Our findings show
that the observed heterogeneity of banks is irreducible to the core-periphery dichotomy. Rather, the opposite is true:
given the observed heterogeneity of banks, the network is found to have no significant core-periphery structure, and
sometimes even has an 'anti-core' one.

The approximate consistency between the real network and the RCM in the initial 1998-2000 period also suggests 
that, in periods of calm, the topology of real interbank networks might be quite accurately reconstructed using only 
the knowledge of the number of (inward, outward, and reciprocated) partners of each bank. Technically, this means 
that, under low stress, real interbank networks might be typical members of an equilibrium statistical ensemble of 
graphs, where banks' connectivities are maximally informative. In practical terms, it means that to characterize the 
network, data requirements are very limited. This intriguing conjecture, which calls for future empirical tests, is 
at the moment supported by an analysis of the Italian interbank network in 1999 carried out in [22], and creates a
parallel with other economic networks, such as the World Trade Web, that are characterized by exactly the same type 
of equilibrium property [T^HHl [25] . 

However, and more importantly, our findings also show that during the build-up of crises the network can
increasingly move away from the expectations derived only from the knowledge of bank-specific properties. In this
out-of-equilibrium regime, the local connectivities of banks become less and less informative about the network as 
a whole. This loss of topological predictability speaks against the use of maximum-entropy techniques aimed at 
reconstructing the most likely configuration of an (unobserved) interbank network when only local information about 
the total assets and liabilities of each bank is available [TT] . Since assets and liabilities are the (transaction) weighted 
counterparts of the in- and out-degree of a bank, our results suggest that this technique might yield a realistic guess 
of the real network only in quiet times. When the network is under stress, maximum-entropy would instead provide 
a greatly distorted picture of it. Strikingly, if our analysis had been carried out on the most likely network consistent 
with the observed degrees (i.e. the DCM), then every dyadic, triadic, or core-periphery property would have appeared, 
at each point in time, as perfectly consistent with the configuration model. We would not have identified or predicted 
any structural collapse and regime shift: the crisis and its precursors appear to hide themselves in higher-order (with 
respect to the degrees) topological properties. Regulation policies based only on bank-specific information, and not 
on the knowledge of the entire network, are thus likely to remain unaware of major warning signals and to fail to
diagnose ongoing worrisome structural changes early.

The interpretation of the detected dyadic and triadic patterns, postponed until now, is also quite revealing. A 
reciprocal connection between two banks (full dyad) is known to be an indication of preferential lending, i.e. an 
enhanced trust of both parties in each other [2S]. Thus, the gradual decrease of reciprocity during the pre-crisis and 
crisis period might be associated with a slow disintegration of trust. It is also clear that the difference between the 
DCM and the RCM is that the former only controls for the heterogeneous connectivities of banks, while the latter also
controls for each bank's tendency towards preferential lending. 

Moreover, both reciprocity and triadic structure have implications for counterparty risk assessment. As other 
authors have already pointed out [511153], Over-The-Counter (OTC, i.e. not disclosed to third party) transactions 
intrinsically generate risk externalities: if bank A issues loans to banks B and C, it will require an interest rate that 
depends on the estimated counterparty risk (which is a function of the fluctuating financial variables on which the
'health' of banks B and C depends). But if B issues another loan to C, and A is not aware of this, the interest rate
claimed by A will underprice counterparty risk, since B becomes vulnerable to a default of C, increasing the correlation 
between the health of B and that of C. Note that this particular triad is motif number 5. This example shows that 
the binary topology of interbank networks (more than the intensity of links) has direct effects on systemic risk, and 
also highlights that some triadic motifs are strongly affected by risk externalities. Now, it should be noted that the 
unreciprocated 3-loop (motif number 9) maximizes the underestimation of risk in OTC transactions: each of the three 
banks involved is not aware of the fact that counterparty risk loops back to itself, creating strong correlations that their
risk premia do not sufficiently protect against. This suggests that, during the cyclic anomaly phase, banks might have
systematically underestimated risk externalities, creating the potential for the crisis to build up in the following
phases.

Note that circularity itself is not necessarily associated with strong risk externalities; but unreciprocated circularity 
is. For instance, within a full dyad, risk loops back between the two banks as well. But in this case both parties are 
aware of it, and can properly price the increased correlation in their interest rates. Also note that, while full dyads 
are still prone to the risk externality involving a third party, this will be a smaller effect since the probability that 
risk loops back along a longer chain of defaults is smaller than that of risk looping back within the dyad itself. Thus, 
at a dyadic level, single dyads are the most prone to the underestimation of counterparty risk, precisely because they 
can become parts of unreciprocated loops. Again, this effect is purely topological: in a mutual dyad with two positive 
but strongly asymmetric weights, both banks can still incorporate these weights to properly price their risk. Only
if one weight is zero, i.e. in a single dyad, is this no longer possible. This further explains why the key information
relevant to us is encoded in the binary topology, and not in the intensity of connections. By contrast, at a triadic 
level, 3-loops involving an increasing number of reciprocated dyads (motifs number 9, 10, 12 and 13 respectively, see 
Appendix) are increasingly less prone to the risk externality. Unreciprocated loops (of any length greater than 2) 
can therefore be considered to be a sort of 'autocatalytic risk loops'. Since longer loops imply smaller probabilities of
cascading defaults, presumably the most dangerous autocatalytic risk loops are precisely those involving three banks.

During the cyclic anomaly phase, all the partly reciprocated loops (motifs 10, 12, 13) were much less abundant
than the completely unreciprocated 3-loop, and always consistent with both null models (DCM and RCM), thus
increasing systemic risk. Only during the pre-crisis phase did the loops with little or no reciprocation (motifs 10 and
9) become increasingly under-represented (figs. 4 and 6). Unfortunately, during the same period, reciprocated dyads
(which dominated the earlier phase) also became increasingly under-represented, and were outnumbered by single dyads
(fig. 2). This suggests that, starting from 2005, the underestimation of systemic risk might have progressively increased,
first due to autocatalytic risk loops during the cyclic anomaly phase and then on a simpler, dyadic level during the
pre-crisis phase.

These considerations show that OTC transactions have the potential to create unintentional but emergent,
self-reinforcing, and destabilizing patterns, and feed into the debate on how OTC markets can be monitored and regulated.
Our results on 'risk autocatalysis' suggest that, even when banks trust each other and spontaneously engage in
reciprocated transactions, autocatalytic risk loops can emerge (in the cyclic anomaly phase, the reciprocity is still 
high). Therefore simply requiring that banks reciprocate a fair amount of transactions (unless this amount means 
all transactions) is not enough in order to prevent the creation of unreciprocated loops. The measures to be taken 
appear to be at an irreducibly triadic level. One possibility is to introduce a Central Clearing Counterparty (CCP) 
who would step in the middle of bilateral OTC trades. Although this would reduce the systemic risk due to private 
interaction, it does introduce the (systemic) risk that the CCP can fail as well. Another, less intrusive approach is to 
start properly monitoring OTC markets and act on anomalous motifs. Given the data collection efforts underway in,
for instance, the UK and internationally at the Bank for International Settlements, this can be readily implemented

mm)- 

Although our results are strong in providing early-warning signals, they are weak with regard to explaining the
economic rationale for the network patterns. As the links are formed in an OTC market, the participants only 
knowingly create the dyads, not the triadic motifs. Further research is needed to understand the economics of 
network formation. 



VIII. CONCLUSIONS 



Motivated by the need to complement recent one-way analyses of systemic risk induced by the topology of financial
and interbank systems, we have empirically investigated the reverse and less studied effects of realized financial 
crises on the structure of interbank networks. Studying the evolution of the Dutch interbank market over an 11- 
year period ending with the 2008 worldwide crisis, we found clear topological signatures of the crisis, a result that 
motivated us to look also for possible topological precursors. Over the period considered, the interbank market 
turns out to evolve from an initial stationary configuration, which is strongly and almost entirely determined by the 
observed heterogeneity of banks, to a final collapsed configuration whose dyadic structure is approximately consistent 
with that of an unstructured random graph. We found that, surprisingly, this transition can appear either abrupt
(unpredictable) or continuous (predictable), the difference lying not in the properties being monitored, but in the
benchmark used to assess the significance of such properties. With respect to a homogeneous reference
resembling a representative agent assumption, the transition occurs suddenly at the onset of the crisis, with no 
earlier warning signal. With respect to a reference calibrated on the observed heterogeneity (of connectivity and 
reciprocity) of banks, the transition is instead slow and gradual, and the otherwise latent building-up of the crisis 
becomes manifest. This occurs as early as three years in advance of the crisis, and is preceded by an even earlier phase 
during which surprisingly many triads of banks are involved in circular lending loops with no reciprocation. Such 
subgraphs are presumably the most dangerous 'autocatalytic risk loops' associated with a systematic underestimation 
of counterparty risk externalities. At the beginning of 2005, these loops suddenly become under-represented, while 
other triadic motifs start to become increasingly significant, slowly leading the network to its collapsed configuration. 
Our results suggest that bank regulation policies could greatly benefit from monitoring different hierarchical levels of 
interbank networks. Special emphasis should be put on checking for potential departures from the expected stationary 
structure induced by the different connectivity and reciprocity of banks. Conversely, analyses and policies based only
on bank-specific information, and not on the entire interbank network, appear destined to remain unaware of ongoing
structural changes that could precede major critical events.



IX. APPENDIX



A. Data 



Data are taken from the database used in [B], which is built from the prudential reporting of balance sheet positions of
Dutch banks. The data include all the exposures between Dutch banks (from contractual obligations to swaps) up to one
year and of more than 1.5 million euros, reported at a quarterly frequency (from 1998Q1 to 2008Q4) [B]. In other
words, the data cover forty-four time periods, corresponding to the forty-four ends of quarters of eleven years (from
1998 to 2008), and show only the existence (or not) of exposures between (anonymized) Dutch banks of more than
1.5 million euros. Note that the last four temporal snapshots correspond to the year 2008, i.e. the first year of the
self-evident crisis, whose beginning can be traced back to August 2007 (as already pointed out in [B]), i.e. the sixth-last
temporal snapshot considered here.

Given the nature of the available data, the Dutch interbank network (DIN) is represented as a binary, directed
network where vertices represent banks and links represent exposures: a link pointing from bank $i$ to bank $j$, at time
$t$, indicates the existence of (at least) an exposure of more than 1.5 million euros, directed from $i$ to $j$, registered
at the end of the particular quarter $t$. The number $N$ of banks varies from period to period, oscillating between 91
and 102 (see main text). For each quarter $t$ (with $t = 1, \dots, 44$), the structure of the network is therefore entirely
described by an $N \times N$ (in general asymmetric) adjacency matrix $A$, whose entries are $a_{ij} = 1$ if a binary directed link
from bank $i$ to bank $j$ exists during that quarter, and $a_{ij} = 0$ otherwise. The number $L$ of directed links is therefore
computed as



$$L = \sum_{i=1}^{N} \sum_{j(\neq i)=1}^{N} a_{ij} \qquad (2)$$

and the link density, or connectance $c$, is

$$c = \frac{L}{N(N-1)}$$
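
The two definitions above translate directly into a few lines of code. The following is a minimal sketch in Python with NumPy; the function name `link_count_and_density` and the 4-bank matrix `A_toy` are illustrative assumptions, not part of the original analysis or data.

```python
import numpy as np

def link_count_and_density(A):
    """Return (L, c) for a binary directed adjacency matrix A."""
    A = np.asarray(A, dtype=int)
    n = A.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)   # the sums skip the terms with i == j
    L = int(A[off_diagonal].sum())          # number of directed links, eq. (2)
    c = L / (n * (n - 1))                   # connectance: L over the N(N-1) ordered pairs
    return L, c

# Hypothetical 4-bank example (not taken from the Dutch data).
A_toy = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 0, 0]])
L_toy, c_toy = link_count_and_density(A_toy)
```

For this toy matrix the five directed links are spread over twelve ordered pairs of banks.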



B. The Core-Periphery Model 

In the literature about interbank markets (see e.g. ref. [6]), the Core-Periphery (CP) model is a popular axiomatic
model, describing an ideal interbank network where nodes are perfectly split in two different classes: core-nodes and 
periphery-nodes. Thus, the core and the periphery are two non-overlapping sets of vertices, defined by means of the 
following three axioms [B]: 

A1: core banks are all bilaterally linked with each other;
A2: periphery banks do not lend to each other; 

A3: core banks both lend to and borrow from at least one periphery bank. 

These axioms describe an ideal structure like that shown in fig. 8.

Of course, real interbank networks differ from a perfect core-periphery structure as postulated by the model. 
Nonetheless, it is still possible to look for the optimal partition of vertices between core and periphery, i.e. the 
partition which minimizes the number of deviations from the CP model. In particular, three types of deviations, or 
'errors' can be defined, depending on which of the three axioms is violated [B]. For instance, if there is only one link 
between two banks that have been assigned to the core (instead of the postulated two links in the CP model), this 
is considered as one error of the first type. And if there is one link between two periphery banks, this is considered 
as one error of the second type. Finally, if a core bank does not lend to (or borrow from) the periphery at all, this is assigned
as many errors as there are periphery banks. For each of the three types of deviations, the 'error score' is simply the 
number of errors of that type. The total error score (for a given partition) is the sum of the scores of all three types 
of errors: 

$$e = e_1 + e_2 + e_3$$

The optimal partition is the one that minimizes e, i.e. the one that best approximates the CP model. 
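
The error count described above can be sketched as follows for a given candidate partition. This is only an illustration under stated assumptions: the function name `cp_error_score`, the boolean `core` mask and the toy matrices are hypothetical, an out-going link is read as lending, and the search for the optimal partition itself is not shown.

```python
import numpy as np

def cp_error_score(A, core):
    """Return (e1, e2, e3, e) for adjacency matrix A and a boolean core mask."""
    A = np.asarray(A, dtype=int)
    core = np.asarray(core, dtype=bool)
    per = ~core
    n_core, n_per = int(core.sum()), int(per.sum())

    # Type 1 (axiom A1): every missing directed link between two core banks.
    core_block = A[np.ix_(core, core)]
    e1 = n_core * (n_core - 1) - int(core_block.sum() - np.trace(core_block))

    # Type 2 (axiom A2): every link between two periphery banks.
    per_block = A[np.ix_(per, per)]
    e2 = int(per_block.sum() - np.trace(per_block))

    # Type 3 (axiom A3): a core bank with no lending to (or no borrowing from)
    # the periphery is charged as many errors as there are periphery banks.
    lends = A[np.ix_(core, per)].sum(axis=1)     # core -> periphery links per core bank
    borrows = A[np.ix_(per, core)].sum(axis=0)   # periphery -> core links per core bank
    e3 = n_per * int((lends == 0).sum() + (borrows == 0).sum())

    return e1, e2, e3, e1 + e2 + e3

# Hypothetical 5-bank example: banks 0 and 1 form the core.
core_toy = np.array([True, True, False, False, False])
A_ideal = np.zeros((5, 5), dtype=int)
for i, j in [(0, 1), (1, 0), (0, 2), (2, 0), (1, 3), (3, 1)]:
    A_ideal[i, j] = 1

A_bad = A_ideal.copy()
A_bad[1, 0] = 0   # one missing core-core link       -> one type-1 error
A_bad[2, 3] = 1   # one periphery-periphery link     -> one type-2 error
A_bad[1, 3] = 0   # core bank 1 stops lending at all -> three type-3 errors

score_ideal = cp_error_score(A_ideal, core_toy)
score_bad = cp_error_score(A_bad, core_toy)
```

Minimizing the total score over all partitions is a combinatorial problem, which is why an optimization algorithm is needed in practice.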

FIG. 8. An ideal example of the core-periphery model: banks A, B, C belong to the core, and the other ones to the periphery.
After ref. [B].



We used a computational algorithm [6] that looks for optimal solutions. Given the resulting partition, we measured
the number $N_c$ of nodes in the core and the number $N_p = N - N_c$ of nodes in the periphery. Similarly, we measured
the number of links in the core as

$$L_c = \sum_{i \neq j:\ (i,j \in \mathrm{core})} a_{ij} \qquad (4)$$

and the number of links in the periphery as

$$L_p = \sum_{i \neq j:\ (i,j \in \mathrm{periphery})} a_{ij} = L - L_c \qquad (5)$$

(from now on, the symbol $\sum_{i \neq j}$ indicates the two nested sums $\sum_{i=1}^{N} \sum_{j(\neq i)=1}^{N}$). We can distinguish between a
core-connectance and a periphery-connectance, respectively defined as

$$c_c = \frac{L_c}{N_c(N_c-1)}, \qquad c_p = \frac{L_p}{N_p(N_p-1)} \qquad (6)$$

The analysis of the above quantities is shown in the main text. In section IX F we employ the CP model more
intensively.



C. Null models 



The approach presented here rests upon the exponential random graphs formalism [2211321 - 157] : given an appropriately 
chosen set of graphs (in what follows the grandcanonical ensemble $\mathcal{G}$, i.e. the collection of graphs with the same number
of nodes as the observed network and a varying number of links, from zero to $N(N-1)$), a probability coefficient like
the following

$$P(A|\vec{\theta}) = \frac{e^{-H(A,\vec{\theta})}}{Z(\vec{\theta})} \qquad (7)$$

is associated with each of them. In eq. (7), $H(A,\vec{\theta}) = \sum_a \theta_a \pi_a(A)$ is the Hamiltonian of the graph, i.e. the linear
combination of topological constraints (dependent on the particular adjacency matrix $A$) we choose to impose on the
aforementioned ensemble [22], and the normalization constant, $Z(\vec{\theta}) = \sum_{A \in \mathcal{G}} e^{-H(A,\vec{\theta})}$, is the partition function.

The unknown parameters can be estimated by maximizing the log-likelihood function of the network, $\ln \mathcal{L}(\vec{\theta}) =
\ln P(A|\vec{\theta})$, with respect to the constraints [57]
or, equivalently, by solving

$$\pi_a(A) = \langle \pi_a \rangle(\vec{\theta}^*) \equiv \langle \pi_a \rangle^*, \quad \forall a \qquad (9)$$

i.e. a list of equations imposing the values of the expected constraints to be equal to the observed ones (the term
"expected", here, refers to the weighted average taken on $\mathcal{G}$, the weights being the probability coefficients above)

[lani]. 

Once the numerical values $\vec{\theta}^*$ of the parameters have been determined, the expected value of any other topological
quantity of interest, $X(A)$, is simply given by:

$$\langle X \rangle^* = \sum_{A \in \mathcal{G}} X(A)\, P(A|\vec{\theta}^*). \qquad (10)$$

However, since the expected values of the most common quantities in complex networks theory are difficult to
calculate exactly, it is often necessary to rely on the linear approximation method: $\langle X \rangle^* \simeq X(\langle A \rangle^*)$, with $\langle A \rangle^*$
indicating the expected adjacency matrix, whose elements will be indicated as $\langle a_{ij} \rangle^* \equiv p_{ij}$. By means of the same
approximation, it is also possible to calculate the standard deviation of the quantities of interest, as described in [22].

If the chosen constraints are linear in the adjacency matrix elements (i.e. if the Hamiltonian takes the form
$H(A,\vec{\theta}) = \sum_{i \neq j} \theta_{ij} a_{ij}$), the expected entries become Bernoulli-like functions of the unknown parameters:

$$p_{ij} = \frac{e^{-\theta_{ij}}}{1 + e^{-\theta_{ij}}} \qquad (11)$$

The following subsections are devoted to the explanation of the null models considered for the present analysis.

1. Directed Random Graph Model 

The Directed Random Graph (DRG in what follows) is the best-known null model in complex networks theory.
Its Hamiltonian consists of a single term, proportional to the total number of links:

$$H(A,\vec{\theta}) = \alpha L = \sum_{i=1}^{N} \sum_{j(\neq i)=1}^{N} \alpha\, a_{ij}. \qquad (12)$$

Since the constraint is linear, the probability coefficients have the functional form shown in eq. (11),

$$p_{ij} = \frac{e^{-\alpha}}{1 + e^{-\alpha}} \equiv p, \quad \forall\, i \neq j \qquad (13)$$

whose unknown parameter can be estimated by solving the likelihood prescription

$$\sum_{i=1}^{N} \sum_{j(\neq i)=1}^{N} a_{ij} = \sum_{i=1}^{N} \sum_{j(\neq i)=1}^{N} p^*. \qquad (14)$$

Eq. (14) can be immediately solved to give $p^* = L/[N(N-1)] = c(A)$. So, the observed connectance is nothing more
than the DRG probability that any two nodes are connected. The major drawback of this null model is that only
one probability coefficient, $p$, accounts for the connection probability of every pair of nodes, thus ignoring their
heterogeneity. A more refined model is presented in the next subsection.



2. Directed Configuration Model



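The Directed Configuration Model (DCM in what follows) is characterized by the following Hamiltonian:

$$H(A,\vec{\theta}) = \sum_{i=1}^{N} \left[ \alpha_i k_i^{out}(A) + \beta_i k_i^{in}(A) \right] = \sum_{i=1}^{N} \sum_{j(\neq i)=1}^{N} (\alpha_i + \beta_j)\, a_{ij} \qquad (15)$$

(where $k_i^{in}(A) = \sum_{j(\neq i)=1}^{N} a_{ji}$ is the in-degree of node $i$ and $k_i^{out}(A) = \sum_{j(\neq i)=1}^{N} a_{ij}$ is the out-degree of node $i$): it is
again a linear function of the adjacency matrix elements, leading to probability coefficients of the form

$$p_{ij} = \frac{e^{-\alpha_i - \beta_j}}{1 + e^{-\alpha_i - \beta_j}} \equiv \frac{x_i y_j}{1 + x_i y_j}, \qquad (16)$$

with $x_i \equiv e^{-\alpha_i}$ and $y_j \equiv e^{-\beta_j}$. The likelihood prescription for the DCM becomes [22]

$$k_i^{out}(A) = \langle k_i^{out} \rangle^* = \sum_{j(\neq i)} p_{ij}^* = \sum_{j(\neq i)} \frac{x_i^* y_j^*}{1 + x_i^* y_j^*}, \qquad k_i^{in}(A) = \langle k_i^{in} \rangle^* = \sum_{j(\neq i)} p_{ji}^* = \sum_{j(\neq i)} \frac{x_j^* y_i^*}{1 + x_j^* y_i^*}, \quad \forall i \qquad (17)$$

where the indices run from 1 to $N$. The in-degree and out-degree of a node are nothing more than the number of
banks a vertex receives loans from and the number of banks a vertex lends to. Except in trivial cases,
the previous system can be solved only numerically.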
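
As an illustration of what "numerically" can mean here, the sketch below solves the DCM likelihood equations with a plain fixed-point iteration on the variables x_i and y_i. This is an assumed recipe, not necessarily the solver used for the paper; the function name `solve_dcm`, the iteration count and the 6-bank edge list are hypothetical, and convergence is not guaranteed for every degree sequence (for instance, a node linked to all others would push its parameter to infinity).

```python
import numpy as np

def solve_dcm(k_out, k_in, n_iter=2000):
    """Return the matrix p_ij = x_i y_j / (1 + x_i y_j) matching the given degrees."""
    k_out = np.asarray(k_out, dtype=float)
    k_in = np.asarray(k_in, dtype=float)
    n = len(k_out)
    x = np.ones(n)
    y = np.ones(n)
    off = 1.0 - np.eye(n)                       # removes the i == j terms from the sums
    for _ in range(n_iter):
        denom = 1.0 + np.outer(x, y)            # denom[i, j] = 1 + x_i y_j
        # Rearranged degree equations:
        #   x_i = k_i^out / sum_{j != i} y_j / (1 + x_i y_j)
        #   y_j = k_j^in  / sum_{i != j} x_i / (1 + x_i y_j)
        x_new = k_out / ((y[None, :] / denom) * off).sum(axis=1)
        y_new = k_in / ((x[:, None] / denom) * off).sum(axis=0)
        x, y = x_new, y_new
    xy = np.outer(x, y)
    return (xy / (1.0 + xy)) * off

# Hypothetical 6-bank network used only to exercise the solver.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 0),
         (3, 4), (4, 5), (5, 0), (4, 0), (1, 5)]
A_toy = np.zeros((6, 6))
for i, j in edges:
    A_toy[i, j] = 1.0

P_toy = solve_dcm(A_toy.sum(axis=1), A_toy.sum(axis=0))
```

The rows and columns of `P_toy` should sum to the observed out- and in-degrees, which is exactly the content of the likelihood prescription.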



3. Reciprocal Configuration Model 



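The third null model we considered is the Reciprocal Configuration Model (RCM in what follows), where each
vertex has the same number of reciprocated, out-going non-reciprocated, and in-coming non-reciprocated links as in
the observed network. In other words, the RCM incorporates not only the information about the number of (in-
and outward) neighbors of a bank, but also the local reciprocity structure of each node, by means of three degree
sequences defined by using the dyadic variables in eqs. (26)-(28) [22] [Ml [39]:

$$\begin{aligned}
k_i^{\rightarrow}(A) &= \sum_{j(\neq i)} a_{ij}^{\rightarrow} = \sum_{j(\neq i)} a_{ij}(1 - a_{ji}), \\
k_i^{\leftarrow}(A) &= \sum_{j(\neq i)} a_{ij}^{\leftarrow} = \sum_{j(\neq i)} a_{ji}(1 - a_{ij}), \\
k_i^{\leftrightarrow}(A) &= \sum_{j(\neq i)} a_{ij}^{\leftrightarrow} = \sum_{j(\neq i)} a_{ij} a_{ji},
\end{aligned} \quad \forall i \qquad (18)$$

so that the resulting Hamiltonian becomes

$$H(A,\vec{\theta}) = \sum_{i=1}^{N} \left( \alpha_i k_i^{\rightarrow} + \beta_i k_i^{\leftarrow} + \gamma_i k_i^{\leftrightarrow} \right). \qquad (19)$$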

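The degree $k_i^{\rightarrow}$ counts the number of links going out from node $i$ and not having a reciprocal partner, the degree
$k_i^{\leftarrow}$ counts the number of links coming to node $i$ and not having a reciprocal partner, and the degree $k_i^{\leftrightarrow}$ counts
the number of links outgoing from, or ingoing to, node $i$ and having a reciprocal partner. The RCM imposes the
above sequences as constraints on the grandcanonical ensemble. As a consequence, the likelihood condition prescribes
to solve the following system [22]:

$$\begin{aligned}
k_i^{\rightarrow}(A) &= \sum_{j(\neq i)} (p_{ij}^{\rightarrow})^* = \sum_{j(\neq i)} \frac{x_i^* y_j^*}{1 + x_i^* y_j^* + x_j^* y_i^* + z_i^* z_j^*}, \\
k_i^{\leftarrow}(A) &= \sum_{j(\neq i)} (p_{ij}^{\leftarrow})^* = \sum_{j(\neq i)} \frac{x_j^* y_i^*}{1 + x_i^* y_j^* + x_j^* y_i^* + z_i^* z_j^*}, \\
k_i^{\leftrightarrow}(A) &= \sum_{j(\neq i)} (p_{ij}^{\leftrightarrow})^* = \sum_{j(\neq i)} \frac{z_i^* z_j^*}{1 + x_i^* y_j^* + x_j^* y_i^* + z_i^* z_j^*},
\end{aligned} \quad \forall i \qquad (20)$$

with $x_i \equiv e^{-\alpha_i}$, $y_i \equiv e^{-\beta_i}$ and $z_i \equiv e^{-\gamma_i}$.

FIG. 9. Evolution of the reciprocity r of the interbank network.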



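Once the unknown parameters have been numerically determined, the probability coefficients $\langle a_{ij}^{\rightarrow} \rangle^* = (p_{ij}^{\rightarrow})^*$,
$\langle a_{ij}^{\leftarrow} \rangle^* = (p_{ij}^{\leftarrow})^*$ and $\langle a_{ij}^{\leftrightarrow} \rangle^* = (p_{ij}^{\leftrightarrow})^*$ can be used to calculate the expected value of all the topological properties of
interest. Note that the usual degree sequences are preserved under the RCM, because

$$k_i^{out} = k_i^{\rightarrow} + k_i^{\leftrightarrow}, \qquad k_i^{in} = k_i^{\leftarrow} + k_i^{\leftrightarrow}. \qquad (21)$$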
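
The three degree sequences constrained by the RCM, and the decomposition of the ordinary degrees into them, can be checked on any binary directed matrix. A minimal sketch follows; the function name `reciprocal_degrees` and the 4-bank matrix are illustrative assumptions.

```python
import numpy as np

def reciprocal_degrees(A):
    """Return (k_out_only, k_in_only, k_mutual) for each node of a binary directed graph."""
    A = np.asarray(A, dtype=int)
    A = A - np.diag(np.diag(A))     # discard any self-loops
    out_only = A * (1 - A.T)        # entry (i, j): link i -> j with no link back
    in_only = A.T * (1 - A)         # entry (i, j): link j -> i with no link back
    mutual = A * A.T                # entry (i, j): links in both directions
    return out_only.sum(axis=1), in_only.sum(axis=1), mutual.sum(axis=1)

# Hypothetical 4-bank example (banks 0 and 1 have a reciprocated link).
A_toy = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 0, 0]])
k_fwd, k_bwd, k_rec = reciprocal_degrees(A_toy)
```

Adding the reciprocated degree to each non-reciprocated one recovers the out-degree and the in-degree, so fixing the three sequences automatically fixes the two constraints of the DCM.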



D. Reciprocity and dyads 

The reciprocity is the fraction of links having a reciprocal partner (i.e. a link pointing in the opposite direction)
[40], and is defined as

$$r \equiv \frac{L^{\leftrightarrow}}{L} = \frac{\sum_{i \neq j} a_{ij} a_{ji}}{L}. \qquad (22)$$



Unlike the number of vertices, the number of links, and the connectance, the reciprocity offers a clear signature of the
crisis, as is evident in fig. 9. For most of the time period it shows an essentially constant trend, with small fluctuations
around an average value of approximately 0.26, but the last four periods are characterized by an impressive decrease of
the reciprocity value (approximately 40%): they lie almost 3 standard deviations away from the sample average, clearly indicating
that the DIN shows an anomalously low reciprocity value in those time periods already affected by the crisis.

What about the periods before it? Just looking at fig. 2, there is no strong evidence of the upcoming event, and
thus the reciprocity by itself is not sufficiently informative about the near future. However, the statistical significance
of this conclusion can only be stated after a comparison with a well defined reference, i.e. the null models introduced
in the previous section.

1. The $\rho$ index

Fig. 9 shows the effectiveness of the reciprocity $r$ in characterizing the first year of the crisis. How statistically significant
is the observed trend? To answer this question, let us implement the DRG and the DCM to compare the observed $r$
with its expected value:

$$\langle r \rangle = \frac{\langle L^{\leftrightarrow} \rangle}{\langle L \rangle}. \qquad (23)$$

FIG. 10. Temporal evolution of the observed reciprocity $r$ (black) and of $\rho$ under the DRG (purple) and the DCM (blue).

In order to do this, let us calculate the $\rho$ index [38], defined as

$$\rho \equiv \frac{r - \langle r \rangle}{1 - \langle r \rangle} \qquad (24)$$

which automatically discounts the effects of the imposed constraints. By definition, $\rho$ ranges between 1 and -1:
in fact, the denominator is always positive and, in magnitude, not smaller than the numerator. It simply normalizes the
index, not contributing to the sign of the quantity itself which, in turn, is decided only by the relative magnitude
of the observed value $r$ and its expectation. A positive sign indicates a stronger than expected tendency to
reciprocate, whereas a negative sign indicates a weaker than expected tendency to establish reciprocal links. The trends of $\rho$
calculated under the DRG and the DCM are shown in fig. 10.

The positive sign of $\rho$ under the DRG (i.e. $\rho_{DRG}$) indicates that the tendency of the network to
reciprocate is stronger than expected under this model. This is intuitive considering that $\langle r \rangle_{DRG} = p$ and that $p$ is
an average of all the single pair-specific probabilities, whose numerical value coincides with the observed connectance.
In fact,

$$\langle r \rangle_{DRG} = \frac{\sum_{i \neq j} p_{ij}\, p_{ji}}{\sum_{i \neq j} p_{ij}} = p = \frac{\langle L \rangle_{DRG}}{N(N-1)} = \langle c \rangle_{DRG} \qquad (25)$$

and, by using the likelihood prescription, $\langle c \rangle_{DRG} = c(A)$ for any matrix $A$ of the time period considered. Given the
low value of the connectance, the DRG predicts a low value for $r$ as well, such that $c < r$ for all the time periods,
thus underestimating the tendency of the links to establish mutual connections. Moreover, even if the DRG correctly
describes the first year of the crisis (as is evident from the final jump of $\rho_{DRG}$), this is exclusively due to the
particular functional form of $\rho_{DRG}$ itself, which is a simple, small translation (and rescaling) of $r$ towards lower values:
$\rho_{DRG} = \frac{r - c}{1 - c} \simeq r - c$. Thus, the discovery of patterns anticipating the crisis is entirely left to $r$ which is, by
itself, blind to them, as already pointed out. As a result, the network seems to suddenly depart from the initial (quite
stable) configuration with many reciprocated links to the crisis configuration, where the number of reciprocal links
sharply diminishes.
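
Because the DRG expectation of the reciprocity is just the connectance, the index under this null model needs nothing beyond the adjacency matrix. The following sketch illustrates this; the function name `rho_drg` and the 4-bank matrix are assumptions made for the example.

```python
import numpy as np

def rho_drg(A):
    """Return (r, c, rho) where rho is the reciprocity index under the DRG."""
    A = np.asarray(A, dtype=int)
    A = A - np.diag(np.diag(A))      # discard any self-loops
    n = A.shape[0]
    L = A.sum()
    r = (A * A.T).sum() / L          # observed reciprocity, eq. (22)
    c = L / (n * (n - 1))            # connectance, equal to <r> under the DRG
    rho = (r - c) / (1 - c)          # eq. (24) with <r> = c
    return r, c, rho

# Hypothetical 4-bank example: 5 links, 2 of which are reciprocated.
A_toy = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 0, 0]])
r_toy, c_toy, rho_toy = rho_drg(A_toy)
```

Since the connectance is the only model input, any time dependence of the index under the DRG comes from r and c alone, which is why it cannot reveal more than r itself.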

Far more interesting is the trend of $\rho_{DCM}$. As a general comment, the network is more consistent with the DCM
null model than with the DRG one, as the smaller values of the respective $\rho$ indices show. In more detail, the
DCM highlights two opposite patterns. During the first twenty-eight periods, the network tends to be
reciprocated more than expected (similar to the DRG, but with the difference that $\rho_{DCM}$ presents an almost constant
trend, showing smaller fluctuations than $\rho_{DRG}$): this implies that even the specification of the entire in- and
out-degree sequences is not enough to fully account for the observed reciprocity, as the positive value of $\rho_{DCM}$ shows.
The same is valid also in the last sixteen periods, with the difference that the network inverts the tendency and
tends to be less reciprocated than expected, showing an almost perfect monotonic decrease (with small, constant
jumps in the value of $\rho_{DCM}$, of approximately four periods each). This clear anti-reciprocal behavior, not detected by
the DRG but revealed by the DCM (i.e. not encoded in the total number of links, but partially encoded in the degree
sequences) may be an early signature of the upcoming crisis, as the nodes start avoiding mutual exchanges two years
before 2008. However, the absence of significance bounds for $\rho$ prevents us from drawing definitive conclusions
about this point. Moreover, $\rho$ is a generic index, only describing the global tendency of links to reciprocate or not.
This leads to the question of how the single pairs of nodes behave, as discussed in the next section.

FIG. 11. The 3 possible binary, directed dyads.



2. Dyadic motifs 



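In order to address the above points, we carried out a more detailed analysis of the reciprocity structure of the DIN,
by looking at the possible dyadic motifs, i.e. the three ways any two given nodes can be (dis)connected in a binary,
directed network (see fig. 11). Dyads provide detailed information about the local reciprocity structure of a network,
by measuring how many single, or mutual, connections a given node has. The statistical significance of the temporal
evolution of the $\rho$ index can be confirmed by checking the statistical significance of the single dyads' behavior, which can
in turn be measured by means of z-scores [2H [32]. The number $L^{\rightarrow}$ of non-reciprocated (single) dyads, (twice) the
number $L^{\leftrightarrow}$ of reciprocated (full) dyads, and (twice) the number $L^{\nleftrightarrow}$ of empty dyads are defined below, along with
the corresponding z-scores:

$$L^{\rightarrow} \equiv \sum_{i \neq j} a_{ij}(1 - a_{ji}); \qquad z_{L^{\rightarrow}} \equiv \frac{L^{\rightarrow} - \langle L^{\rightarrow} \rangle}{\sigma[L^{\rightarrow}]} \qquad (26)$$

$$L^{\leftrightarrow} \equiv \sum_{i \neq j} a_{ij} a_{ji}; \qquad z_{L^{\leftrightarrow}} \equiv \frac{L^{\leftrightarrow} - \langle L^{\leftrightarrow} \rangle}{\sigma[L^{\leftrightarrow}]} \qquad (27)$$

$$L^{\nleftrightarrow} \equiv \sum_{i \neq j} (1 - a_{ij})(1 - a_{ji}); \qquad z_{L^{\nleftrightarrow}} \equiv \frac{L^{\nleftrightarrow} - \langle L^{\nleftrightarrow} \rangle}{\sigma[L^{\nleftrightarrow}]} \qquad (28)$$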



The expected value and standard deviation of the dyads can be calculated analytically under both the DRG and the DCM
considering that, for networks with local constraints, the dyads in a network are independent random variables [22].
The temporal evolution of the dyadic z-scores under the DRG and the DCM is portrayed in the Main Text (note that,
under the RCM, each of the three dyadic abundances is reproduced by construction, which trivially implies that all
dyadic z-scores are zero).
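
The independence of dyads makes the analytical calculation short: under a model with independent links, each pair of nodes is a single, full or empty dyad with a probability that depends only on the two link probabilities of that pair. The sketch below is one way to write this down; the function name `dyad_statistics`, the dictionary layout and the toy inputs are assumptions, and the DRG is used as the link-probability matrix only for simplicity.

```python
import numpy as np

def dyad_statistics(A, P):
    """Observed value, expectation, standard deviation and z-score of the dyad counts.

    A is the binary adjacency matrix; P holds independent link probabilities p_ij
    (as under the DRG or the DCM). Full and empty dyads are counted twice,
    following the convention of the text.
    """
    A = np.asarray(A, dtype=float)
    P = np.asarray(P, dtype=float)
    iu = np.triu_indices(A.shape[0], k=1)      # every unordered pair exactly once
    a_ij, a_ji = A[iu], A.T[iu]
    p_ij, p_ji = P[iu], P.T[iu]

    observed = {
        "single": (a_ij * (1 - a_ji) + a_ji * (1 - a_ij)).sum(),
        "full": 2 * (a_ij * a_ji).sum(),
        "empty": 2 * ((1 - a_ij) * (1 - a_ji)).sum(),
    }
    # Probability that a given pair is a single, full or empty dyad.
    prob = {
        "single": p_ij * (1 - p_ji) + p_ji * (1 - p_ij),
        "full": p_ij * p_ji,
        "empty": (1 - p_ij) * (1 - p_ji),
    }
    weight = {"single": 1, "full": 2, "empty": 2}

    stats = {}
    for name in observed:
        q = prob[name]
        mean = weight[name] * q.sum()
        std = weight[name] * np.sqrt((q * (1 - q)).sum())   # sum of independent Bernoulli variances
        stats[name] = {"obs": observed[name], "mean": mean, "std": std,
                       "z": (observed[name] - mean) / std}
    return stats

# Hypothetical 4-bank example evaluated against the DRG (uniform p equal to the connectance).
A_toy = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 0, 0]])
p_drg = A_toy.sum() / (4 * 3)
P_toy = p_drg * (1 - np.eye(4))
dyads_toy = dyad_statistics(A_toy, P_toy)
```

Whenever the model reproduces the total number of links on average, the deviations of full and single dyads from their expectations are equal and opposite.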

The dyadic z-scores show the same behaviour as the $\rho$ index. The explanation lies in the analytical form of such
quantities: the numerators of the $\rho$ index and of the dyadic z-scores are (except for a minus sign) the same. In fact

$$\rho = \frac{r - \langle r \rangle}{1 - \langle r \rangle} = \frac{L^{\leftrightarrow} - \langle L^{\leftrightarrow} \rangle}{L - \langle L^{\leftrightarrow} \rangle} \qquad (29)$$

and, considering that $L = L^{\rightarrow} + L^{\leftrightarrow}$,

$$L^{\leftrightarrow} - \langle L^{\leftrightarrow} \rangle = -(L^{\rightarrow} - \langle L^{\rightarrow} \rangle) = L^{\nleftrightarrow} - \langle L^{\nleftrightarrow} \rangle. \qquad (30)$$

Note that, under the RCM, all local and global dyadic properties of the real network are preserved. As a consequence,
$z_{L^{\leftrightarrow}}$, $z_{L^{\rightarrow}}$ and $z_{L^{\nleftrightarrow}}$ are, by construction, zero. For the same reason, we also have $\rho_{RCM} = 0$, and the observed
reciprocity structure is completely reproduced by the model. So, the RCM fixes the dyadic motifs and can reveal
patterns of self-organization between dyads, pointing out how triads (or larger sets) of nodes interact.

FIG. 12. The 13 possible triadic motifs involving three connected vertices.



E. Triads 



The 13 possible motifs involving three connected vertices are shown in fig. 12. The analysis of these triadic motifs
has been carried out by implementing the DCM and the RCM, as shown in the Main Text. Here we provide details
of that analysis, and more results relating to the DRG.



1. Triadic z-scores



For each motif $m = 1, \dots, 13$, the abundance $N_m$ (up to a constant factor $a_m$ that depends on the symmetry of the
particular motif, and that drops out of all measured quantities) is obtained as shown in table I.



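Triadic motif (m)    Abundance ($N_m$) 

1     $\sum_{i\neq j\neq k} (1-a_{ij})\,a_{ji}\,a_{jk}\,(1-a_{kj})\,(1-a_{ik})\,(1-a_{ki})$ 
2     $\sum_{i\neq j\neq k} a_{ij}\,(1-a_{ji})\,a_{jk}\,(1-a_{kj})\,(1-a_{ik})\,(1-a_{ki})$ 
3     $\sum_{i\neq j\neq k} a_{ij}\,a_{ji}\,a_{jk}\,(1-a_{kj})\,(1-a_{ik})\,(1-a_{ki})$ 
4     $\sum_{i\neq j\neq k} (1-a_{ij})\,(1-a_{ji})\,a_{jk}\,(1-a_{kj})\,a_{ik}\,(1-a_{ki})$ 
5     $\sum_{i\neq j\neq k} (1-a_{ij})\,a_{ji}\,a_{jk}\,(1-a_{kj})\,a_{ik}\,(1-a_{ki})$ 
6     $\sum_{i\neq j\neq k} a_{ij}\,a_{ji}\,a_{jk}\,(1-a_{kj})\,a_{ik}\,(1-a_{ki})$ 
7     $\sum_{i\neq j\neq k} a_{ij}\,a_{ji}\,(1-a_{jk})\,a_{kj}\,(1-a_{ik})\,(1-a_{ki})$ 
8     $\sum_{i\neq j\neq k} a_{ij}\,a_{ji}\,a_{jk}\,a_{kj}\,(1-a_{ik})\,(1-a_{ki})$ 
9     $\sum_{i\neq j\neq k} (1-a_{ij})\,a_{ji}\,(1-a_{jk})\,a_{kj}\,a_{ik}\,(1-a_{ki})$ 
10    $\sum_{i\neq j\neq k} (1-a_{ij})\,a_{ji}\,a_{jk}\,a_{kj}\,a_{ik}\,(1-a_{ki})$ 
11    $\sum_{i\neq j\neq k} a_{ij}\,(1-a_{ji})\,a_{jk}\,a_{kj}\,a_{ik}\,(1-a_{ki})$ 
12    $\sum_{i\neq j\neq k} a_{ij}\,a_{ji}\,a_{jk}\,a_{kj}\,a_{ik}\,(1-a_{ki})$ 
13    $\sum_{i\neq j\neq k} a_{ij}\,a_{ji}\,a_{jk}\,a_{kj}\,a_{ik}\,a_{ki}$ 

TABLE I. Classification and abundances (up to a symmetry factor) of the 13 triadic motifs. The three nested sums run from 
1 to $N$. 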
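The abundances in table I can be evaluated directly from a binary adjacency matrix. The following is a minimal sketch, not the authors' implementation: the function name `motif_abundance`, the dictionary `MOTIF_PATTERNS` and the six-bit encoding of each row of the table are assumptions introduced here for illustration. Each bit says whether the corresponding factor in the table is $a_{xy}$ (1) or $(1 - a_{xy})$ (0), in the order $(ij, ji, jk, kj, ik, ki)$.

```python
import numpy as np

# Hypothetical encoding of table I: bits for (a_ij, a_ji, a_jk, a_kj, a_ik, a_ki).
MOTIF_PATTERNS = {
    1: (0, 1, 1, 0, 0, 0),  2: (1, 0, 1, 0, 0, 0),  3: (1, 1, 1, 0, 0, 0),
    4: (0, 0, 1, 0, 1, 0),  5: (0, 1, 1, 0, 1, 0),  6: (1, 1, 1, 0, 1, 0),
    7: (1, 1, 0, 1, 0, 0),  8: (1, 1, 1, 1, 0, 0),  9: (0, 1, 0, 1, 1, 0),
    10: (0, 1, 1, 1, 1, 0), 11: (1, 0, 1, 1, 1, 0), 12: (1, 1, 1, 1, 1, 0),
    13: (1, 1, 1, 1, 1, 1),
}

def motif_abundance(A, pattern):
    """Sum over ordered triples of distinct vertices (i, j, k) of the product in table I."""
    A = np.asarray(A, dtype=np.int64)
    # Zeroing the diagonals of the three pair matrices enforces i != j, j != k, i != k.
    off_diag = 1 - np.eye(A.shape[0], dtype=np.int64)

    def factor(bit, M):
        # a_xy if the bit is set, (1 - a_xy) otherwise.
        return M if bit else 1 - M

    b_ij, b_ji, b_jk, b_kj, b_ik, b_ki = pattern
    X = factor(b_ij, A) * factor(b_ji, A.T) * off_diag  # indexed [i, j]
    Y = factor(b_jk, A) * factor(b_kj, A.T) * off_diag  # indexed [j, k]
    Z = factor(b_ik, A) * factor(b_ki, A.T) * off_diag  # indexed [i, k]
    return int(np.einsum("ij,jk,ik->", X, Y, Z))
```

Because the sums run over ordered triples, each occurrence of a motif is counted as many times as its symmetry allows (for instance, a single directed 3-cycle contributes 3 to the count for motif 9). This multiplicity is the constant symmetry factor mentioned above, and it cancels in the z-scores.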



The z-score for the abundance of a particular triadic motif reads 

$$z_m \equiv \frac{N_m - \langle N_m\rangle}{\sigma[N_m]} \qquad (31)$$



2. Triadic structure under the DRG 



Under the DRG, the z-scores are easy to calculate analytically, considering that $p_{ij} = p$, $\forall\, i \neq j$, and by using the 
linear approximation to calculate the standard deviations (as shown in ref. [H]): 

$$\langle N_m\rangle = T_1\, p^k (1-p)^{6-k}, \qquad \sigma[N_m] = T_2 \left[k\, p^{k-1}(1-p)^{6-k} - (6-k)\, p^k (1-p)^{5-k}\right] \qquad (32)$$

where $k = 2, 3, 4, 5$ respectively for motif 2, motifs 5 and 9, motif 10 and motif 12, $T_1 = N(N-1)(N-2)$ is the number of distinct, directed 
triads and $T_2 = (N-2)\sqrt{N(N-1)\,p(1-p)}$. The motifs considered here are the ones already considered in the Main 
Text: motifs 2, 5, 9, 10 and 12. The results are shown in figs. 13, 14 and 15. 
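As a worked illustration of eq. (32), the sketch below evaluates the DRG expectation and the linear-approximation standard deviation for a motif with $k$ links. The function names and the use of an absolute value in the z-score are assumptions made here (the bracket in eq. (32) can be negative, and vanishes at $p = k/6$, where the linear approximation gives no usable standard deviation).

```python
import math

def drg_motif_moments(n_nodes, p, k):
    """Return (<N_m>, sigma[N_m]) as written in eq. (32) for a motif with k links."""
    t1 = n_nodes * (n_nodes - 1) * (n_nodes - 2)
    t2 = (n_nodes - 2) * math.sqrt(n_nodes * (n_nodes - 1) * p * (1 - p))
    mean = t1 * p**k * (1 - p) ** (6 - k)
    # Derivative of p^k (1-p)^(6-k) with respect to p (the linear approximation).
    slope = k * p ** (k - 1) * (1 - p) ** (6 - k) - (6 - k) * p**k * (1 - p) ** (5 - k)
    return mean, t2 * slope

def drg_z_score(observed, n_nodes, p, k):
    """z-score of an observed abundance; abs() is an assumption of this sketch."""
    mean, sigma = drg_motif_moments(n_nodes, p, k)
    return (observed - mean) / abs(sigma)
```

For example, `drg_motif_moments(10, 0.5, 2)` uses $T_1 = 720$, so the expected abundance is $720 \cdot 0.5^6 = 11.25$.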

The z-scores under the DRG do not single out any further division into sub-periods, as shown in fig. 13. Motif 12 
is the only one that displays variations, but these correspond to a non-monotonic, oscillating temporal trend, as 
shown in fig. 14. Motifs 2, 10 and 12 do not even signal the anomalous period of the crisis, as is evident from 
fig. 14: no significant departure from the trend of the previous periods is appreciable. In this respect, the only 
useful trend is provided by motif 5, which shows an evident increase just before the beginning of the critical period. 
However, no evidence of a pre-crisis period is detectable in any of the four motifs considered, confirming the limited 
use of the DRG in providing useful predictions. So the above triadic motifs calculated under the DRG can at best 
confirm the same unpredictability as the dyadic ones (some triadic patterns undergo a sudden change in 2008), and at 
worst show fluctuations so large during the entire period that it is difficult to tell whether any significant 
change is taking place at all at the onset of the crisis. 

FIG. 13. Triadic z-scores for the motifs 2, 5, 9, 10, 12 in the quarterly snapshots 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, under the 
DRG. 

Only motif 9 shows a more informative temporal evolution and partly confirms (even if with much less significance) 
the results we found for the same motif under the DCM and the RCM (see main text), as fig. 15 shows. Even if 
neither the crisis nor the pre-crisis period is detected, the anomaly in the 'cyclic anomaly' period is still visible, 
appearing as a global increase of the z-scores before they return to an almost constant value in the last 
temporal periods, $t_{30}$ to $t_{44}$. 



3. Triadic structure under the DCM 



Motifs' profiles under the DCM represent one of the most significant results of the whole analysis, unambiguously 
showing the evolution of the DIN, as is clear from the forty-four motifs' profiles plotted together in fig. 16. 
The apparent disorder resulting from plotting all the forty-four profiles together hides the four stationary sub-profiles 
shown in the main text. We will discuss each of the sub-periods in turn. 

In the first subperiod (covering the time periods between $t_1$ and $t_{10}$; see main text), the DCM seems to explain 
the motifs' profiles quite well (almost all the z-scores lie between $z = +3$ and $z = -3$): the only outlier is motif 8, 
whose abundance is underestimated by the DCM (the z-score is positive). Looking back at fig. 12, this means that, 
even if the total number of reciprocal and empty dyads is consistent with the DCM prediction in this subperiod, the 
abundance of triads composed of two reciprocated dyads and one empty dyad is still underestimated by it. This 
suggests that the simple topological information about the number of neighbors is still not sufficient to account for 
this kind of dyadic interaction, and that more precise information about the nodes' local reciprocity structure is needed. 

In the second subperiod (between $t_{11}$ and $t_{18}$), even if motif 8 is again underestimated, the largest discrepancy 
is between the abundance of motif 9 (rising with respect to the first ten years) and its expected value, while the 
other motifs remain quite stable. Motif 9 is a complete triad of single dyads. Its underestimation could be due to the 
presence of genuine self-organization patterns at the triadic level rather than to the reciprocity structure being ignored. 
As in the previous case, the DCM does not account for them. 

The third subperiod (between $t_{19}$ and $t_{40}$) shows several 
motifs becoming more significant than in the previous subperiods: motifs 2 and 5 become overrepresented (i.e. 
underestimated by the DCM) and motifs 10 and 12 become underrepresented (i.e. overestimated by the DCM). In 
the fourth subperiod (between $t_{41}$ and $t_{44}$), these patterns become even more pronounced. The increasing divergence 
of these motifs from the DCM null could be another signature of the crisis, also by direct comparison with the dyadic 
abundances: just as the number of reciprocal dyads is overestimated in the year 2008, so are motifs 10 and 12, both defined 
by (at least) one reciprocal dyad; and just as the number of single dyads is underestimated in the year 2008 (becoming only 
marginally consistent with the DCM prediction), so are motifs 2 and 5, both defined by (at least) two single dyads. 

FIG. 14. Temporal evolution of motif 2 (top-left), motif 5 (top-right), motif 10 (bottom-left), motif 12 (bottom-right), under 
the DRG. 

FIG. 15. Temporal evolution of motif 9, under the DRG. 



4. Triadic structure under the RCM 



What clearly emerges from the previous analysis is the fundamental role of the reciprocity structure of the DIN in 
describing both the crisis and the pre-crisis period. The DCM only partially accounts for the dyadic and the triadic 
structure of the network, highlighting the emergence of patterns not encoded in the degree sequences. In order to 
discover the presence of higher-order patterns, not encoded in the dyadic structure, the next step is to fix this kind of 
topological information beforehand as well. In other words, to disentangle dyadic effects from a genuine organization 
at the triadic level, we need to wash out the information about the reciprocity itself, by including it in our null model 
from the start. This results in the RCM, with fig. 17 showing the associated triadic z-scores. It is clear at a glance 
that now, except for motifs 9 and 10, all motifs are approximately consistent with the null model. This means that, 
after controlling for the dyadic patterns already identified, motifs 9 and 10 still emerge as strongly significant triadic 
building blocks, irreducible to a combination of dyads. 

FIG. 16. Triadic z-scores for all 44 quarters, under the DCM. 

The temporal evolution of these triadic z-scores (see main text) reveals that, out of the four motifs that are 
significant under the DCM in the third ('pre-crisis') subperiod, only motif 10 retains the same significance 
under the RCM, signalling that the information about the reciprocity is not sufficient to explain the abundance of this 
motif. Just like motif 9, which sharply inverts its trend, motif 10 closes a triangular loop, being composed of two single 
dyads and one reciprocal dyad; this seems to confirm the additional presence of non-trivial third-order correlations. In 
other words, the network evolves from configurations with an exceptional, unexplained abundance of unreciprocated 
triangular loops to configurations where nodes strongly prefer avoiding them. The fourth ('crisis') subperiod is, again, 
a further evolution of the third one, showing motifs 9 and 10 evolving towards more strongly significant values. So, 
the first year of the crisis seems to be characterized by a marked absence of triangular interactions, while all the other 
patterns seem to be compatible with the RCM prediction. The evolution of motif 9 (see main text) confirms this, 
clearly indicating that a sort of triadic self-organization indeed exists and plays a fundamental role in shaping the 
interbank exchanges. 



F. Evolution of the core-periphery structure 



In this section we describe our analyses of the evolution of the core-periphery structure of the DIN. As described 
in section IX B, we first looked for the optimal partition of vertices providing the closest approximation to the 
CP model. As for the dyadic and triadic structure, we are interested in understanding whether the observed 
core-periphery structure is statistically significant, or whether it can be explained merely in terms of the local topological 
properties. 

FIG. 17. Triadic z-scores for all 44 quarters, under the RCM. 



1. Error score 



To this end, we first studied whether the error score $e$ measured on the real network (given the optimal partition) 
is consistent with the value $\langle e\rangle$ (given the same partition) expected under the null models considered so far. Given a 
null model, we therefore define the z-score 

$$z_e \equiv \frac{e - \langle e\rangle}{\sigma[e]} \qquad (33)$$

measuring by how many standard deviations the observed divergence from the ideal CP model differs from its 
expectation. As we now show, it is possible to evaluate $z_e$ analytically. 

Generally, the optimal partition does not contain errors of the third type, since the latter are severely penalized 
(see section IX B for the definition). As a consequence, it is sensible and easier to restrict our attention to the errors 
of the first and second type, which measure the deviation from a 'relaxed' CP model without the third axiom. If 
$V_c = N_c(N_c - 1)$ and $V_p = N_p(N_p - 1)$ denote the volume (number of possible links) of the core and periphery 
respectively, we consider the following simplified error score: 

$$e \equiv e_1 + e_2 = V_c - L_c + L_p \qquad (34)$$

where $V_c - L_c$ is the number of missing (with respect to the CP model) links in the core and $L_p$ is the number of 
extra (with respect to the CP model) links in the periphery. In an ideal core-periphery model we would have $L_c = V_c$ 
and $L_p = 0$, so that $e = 0$. Using the expression above, we can write 

$$\langle e\rangle = V_c - \langle L_c\rangle + \langle L_p\rangle \qquad (35)$$

Moreover, it is easy to check that 

$$\sigma^2[e] = \sigma^2[L_c] + \sigma^2[L_p] \qquad (36)$$

Taken together, these expressions yield the desired analytical formula 

$$z_e = \frac{e - \langle e\rangle}{\sigma[e]} = \frac{(L_p - L_c) - \langle L_p - L_c\rangle}{\sqrt{\sigma^2[L_c] + \sigma^2[L_p]}} \qquad (37)$$

Equation (37) shows that the fundamental variable appearing in the error definition is the difference between the number of links 
in the periphery and the number of links in the core. 

FIG. 18. z-score for the error score e under the DRG (purple), the DCM (blue) and the RCM (green). 



Armed with the above result, we employ the DRG, the DCM and the RCM and evaluate the corresponding 
z-scores. The results are shown in fig. 18. Using the DRG, we find that the observed core-periphery structure is 
strongly significant, as the DRG predicts a much larger expected error score, resulting in large negative values of $z_e$. 
By contrast, using the DCM and RCM (which both give approximately zero $z_e$), we find that the real network is as close to the 
ideal CP model as the null models are. The reason for $z_e$ being almost identically zero is that, as can be easily checked, 
all the possible rewiring moves that preserve the degrees of vertices exactly also preserve the number of errors of the 
first and second type (note that this would imply $\langle e\rangle = e$ and $\sigma[e] = 0$, making $z_e$ undefined). Our different approach, 
where degrees are preserved on average (see section IX C), allows for errors not to be preserved exactly in each single 
realization of the network. This implies that $\sigma[e] > 0$, making $z_e$ properly defined. Still, we find that $z_e \approx 0$, meaning 
that $e - \langle e\rangle$ is extremely close to zero. In other words, the observed core-periphery structure, as measured from the error 
score, is not a genuinely higher-order property, since it is simply explained by the degrees of vertices (for a discussion 
of this result see also [5]). 
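The analytical z-score of eq. (37) can be sketched numerically as follows. This is an illustration under a stated assumption, not the authors' code: links are treated as independent Bernoulli variables with probabilities $p_{ij}$ (as in the DRG and the DCM), so that $\langle L\rangle = \sum p_{ij}$ and $\sigma^2[L] = \sum p_{ij}(1 - p_{ij})$ within each block. The function name `error_score_z` and its arguments are hypothetical.

```python
import numpy as np

def error_score_z(A, P, core):
    """z-score of e = V_c - L_c + L_p (eq. 37), assuming independent links.

    A    : observed binary adjacency matrix (N x N)
    P    : matrix of link probabilities p_ij under the null model (N x N)
    core : boolean vector, True for core vertices, False for periphery vertices
    """
    A = np.asarray(A, dtype=float)
    P = np.asarray(P, dtype=float)
    core = np.asarray(core, dtype=bool)
    off_diag = ~np.eye(core.size, dtype=bool)
    core_block = np.outer(core, core) & off_diag    # ordered core-core pairs
    peri_block = np.outer(~core, ~core) & off_diag  # ordered periphery-periphery pairs

    observed = A[peri_block].sum() - A[core_block].sum()   # L_p - L_c
    expected = P[peri_block].sum() - P[core_block].sum()   # <L_p - L_c>
    variance = (P * (1 - P))[core_block | peri_block].sum()  # sigma^2[L_c] + sigma^2[L_p]
    return (observed - expected) / np.sqrt(variance)
```

The constant $V_c$ cancels between $e$ and $\langle e\rangle$, which is why only $L_p - L_c$ enters the computation. Core-periphery links play no role, consistently with the restriction to errors of the first and second type.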



2. Density contrast 

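To double-check the above results, we introduced an alternative measure of the strength of core-periphery structure, 
namely the density contrast defined as 

$$\Delta c \equiv c_c - c_p = \frac{L_c}{V_c} - \frac{L_p}{V_p} \qquad (38)$$

i.e., the difference between the link density in the core and the link density in the periphery (see section IX B). This 
quantity is a very intuitive measure of the excess core density and, unlike the error score, is not trivially preserved by 
rewiring moves that preserve degrees, either exactly or on average. The z-score for the density contrast is 

$$z_{\Delta c} = \frac{\Delta c - \langle\Delta c\rangle}{\sigma[\Delta c]} = \frac{\left(\frac{L_c}{V_c} - \frac{L_p}{V_p}\right) - \left\langle\frac{L_c}{V_c} - \frac{L_p}{V_p}\right\rangle}{\sqrt{\sigma^2[L_c]/V_c^2 + \sigma^2[L_p]/V_p^2}} \qquad (39)$$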

The evolution of $z_{\Delta c}$ under the three null models is shown in fig. 19. Clearly, the DRG prediction underestimates 
the contrast. This is easily understandable by looking back at fig. 1 in the Main Text: the only information the 
DRG uses is the global connectance, thus predicting a more homogeneous structure and a smaller difference between 
the densities of the two zones than observed. However, the first year of the crisis shows a jump towards smaller values of 
the z-score as if, exactly as pointed out by the reciprocity index $\rho$, in 2008 the network became sparser and actually 
more homogeneous (less difference between a "suspected" core and periphery), thus being in better agreement with 
the DRG. 

FIG. 19. z-score for the density contrast $\Delta c$ under the DRG (purple), the DCM (blue) and the RCM (green). 

The DCM and the RCM predict the same density contrast for most of the time period. The two models start 
deviating from each other approximately around period $t_{30}$, when the "pre-crisis" period starts: in particular, the 
DCM does not correctly account for the observed $\Delta c$, especially in the last periods (approximately the two years 
ending at $t_{44}$), when the crisis spreads. Here, the trend of the z-score highlights that the DIN structure is actually more 
homogeneous than predicted by the DCM: the network, as it evolves towards the crisis, seems to actually lose a sort of 
internal structure, based on the different density of links in the core and periphery areas. Moreover, this information 
seems to be again encoded into the reciprocity structure of the network, as the RCM, which accounts for this, 
correctly explains the density contrast. This sheds new light on the evolution of the reciprocity itself, as pointed out 
in the conclusions. 



3. The distribution of $L_p - L_c$ 



The z-scores have well-defined statistical significance levels only in the case of normally distributed variables. We 
tested this assumption only for a single time period ($t_9$), by proceeding in the following way. We implemented the 
DCM numerically for the chosen time period by generating 50,000 binary, directed matrices, according to the DCM 
rule 

$$a_{ij} = \begin{cases} 1, & \text{if } u \sim U[0,1] < p^*_{ij} \\ 0, & \text{else} \end{cases} \qquad \forall\, i \neq j$$

i.e. by extracting a real number uniformly distributed between 0 and 1 and comparing it with $p^*_{ij}$ (whose numerical 
value was computed according to the maximum-likelihood procedure), for each entry of the adjacency matrix, 
$a_{ij}$. Then, over the matrices $M$ belonging to this numerically generated grandcanonical ensemble, $\mathcal{M}$, we calculated 
the distribution of the random variable $L_p - L_c$, its arithmetic mean, 

$$\overline{L_p - L_c} = \frac{1}{|\mathcal{M}|}\sum_{M \in \mathcal{M}} (L_p - L_c)_M \qquad (40)$$

and its standard deviation, defined through 

$$\sigma^2[L_p - L_c] = \overline{(L_p - L_c)^2} - \overline{(L_p - L_c)}^{\,2} \qquad (41)$$

FIG. 20. Histogram showing the distribution of $L_p - L_c$ on $\mathcal{M}$ (a numerically generated grandcanonical ensemble according 
to the DCM probability coefficients and composed of 50,000 matrices), Gaussian fit with parameters $\mu = \overline{L_p - L_c}$ and 
$\sigma^2 = \overline{(L_p - L_c)^2} - \overline{(L_p - L_c)}^{\,2}$ (top, red) and Gaussian fit with parameters $\mu = \langle L_p - L_c\rangle$ and 
$\sigma^2 = \langle (L_p - L_c)^2\rangle - \langle L_p - L_c\rangle^2$ (bottom, green). 



The results are shown in fig. 20. The histograms both show the same distribution of $L_p - L_c$: we simply added two 
fits. The red fit is a Gaussian fit with parameters $\mu = \overline{L_p - L_c}$ and $\sigma^2 = \overline{(L_p - L_c)^2} - \overline{(L_p - L_c)}^{\,2}$, i.e. the same as in 
eqs. (40) and (41). The green fit is again a Gaussian, with parameters $\mu = \langle L_p - L_c\rangle$ and $\sigma^2 = \langle (L_p - L_c)^2\rangle - \langle L_p - L_c\rangle^2$, 
i.e. those analytically calculated by means of the maximum-likelihood procedure (and implemented for the 
same time period). What we observe is a substantial agreement between the distribution of $L_p - L_c$ calculated on 
the numerically generated grandcanonical ensemble and the two Gaussian fits, thus confirming the usual statistical 
interpretation of the z-scores. 
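The numerical check described in this subsection can be sketched as follows. This is a minimal illustration, not the authors' code: the probability matrix `P` stands in for the fitted $p^*_{ij}$ (which must be obtained separately from the maximum-likelihood procedure), and the function names, the `core` vector and the seed are assumptions introduced here.

```python
import numpy as np

def _blocks(core):
    """Boolean masks for ordered core-core and periphery-periphery pairs, and off-diagonal pairs."""
    core = np.asarray(core, dtype=bool)
    off_diag = ~np.eye(core.size, dtype=bool)
    return np.outer(core, core) & off_diag, np.outer(~core, ~core) & off_diag, off_diag

def sample_lp_minus_lc(P, core, n_samples=50_000, seed=0):
    """Sample L_p - L_c by drawing each a_ij as 1 if u ~ U[0,1] < p_ij, else 0."""
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    core_block, peri_block, off_diag = _blocks(core)
    values = np.empty(n_samples)
    for s in range(n_samples):
        A = (rng.random(P.shape) < P) & off_diag  # one realization, no self-loops
        values[s] = A[peri_block].sum() - A[core_block].sum()
    return values

def analytic_moments(P, core):
    """Mean and standard deviation of L_p - L_c for independent Bernoulli links."""
    P = np.asarray(P, dtype=float)
    core_block, peri_block, _ = _blocks(core)
    mean = P[peri_block].sum() - P[core_block].sum()
    var = (P * (1 - P))[core_block | peri_block].sum()
    return mean, np.sqrt(var)
```

Comparing `values.mean()` and `values.std()` with the output of `analytic_moments`, and a histogram of `values` with the two corresponding Gaussians, reproduces the kind of comparison shown in fig. 20.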



G. Akaike's Information Criterion (AIC) 



For the analysis of the DIN we have used three null models: the DRG, the DCM and the RCM. Which of these is 
the 'best' null model in terms of an optimal trade-off between parsimony and ability to replicate the data? In order to 
answer this question, we use Akaike's Information Criterion (AIC in what follows) and the Akaike weights [41]. 
These techniques have recently been used in the analysis of other networks [42]. 

For each snapshot $t$ and for each model, AIC is defined as the difference between (twice) the number $K$ of parameters 
and (twice) the log-likelihood at its maximum: 

$$AIC^t = 2K^t - 2\ln\mathcal{L}^t \qquad (42)$$

FIG. 21. Temporal evolution of Akaike weights for the DCM (blue) and the RCM (green). 



The model with the smallest value of $AIC^t$ achieves the optimal balance between explanatory power (large 
log-likelihood) and parsimony (small number of parameters) [H]. Across all snapshots, we found that the DRG is never the 
best model, while the DCM and RCM sometimes compete. To better discriminate between two competing models, 
the Akaike weights can be used. Given two models 1 and 2, we can calculate the two values (for each time period) 

$$\Delta_i^t = AIC_i^t - \min\{AIC_1^t, AIC_2^t\}, \qquad i = 1, 2 \qquad (43)$$

to define the Akaike weights [31] as 

$$w_i^t = \frac{e^{-\Delta_i^t/2}}{e^{-\Delta_1^t/2} + e^{-\Delta_2^t/2}} \qquad (44)$$

By definition, $w_1 + w_2 = 1$. A value $w_i \approx 1$ implies that model $i$ strongly outperforms the other model, which can 
therefore be discarded. Values $w_1 \approx w_2 \approx 1/2$ imply that both models are very similar, and neither can easily be 
discarded. 
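Equations (42) to (44) translate directly into a few lines of code. The sketch below is illustrative only: the function names are introduced here, and the maximized log-likelihoods and parameter counts of the models must be supplied by the user.

```python
import math

def aic(n_params, max_log_likelihood):
    """AIC = 2K - 2 ln L, with ln L the log-likelihood at its maximum (eq. 42)."""
    return 2 * n_params - 2 * max_log_likelihood

def akaike_weights(aic_values):
    """Akaike weights for a list of competing models (eqs. 43 and 44)."""
    best = min(aic_values)
    # Subtracting the minimum first (eq. 43) also keeps the exponentials well-scaled.
    relative = [math.exp(-(value - best) / 2) for value in aic_values]
    total = sum(relative)
    return [r / total for r in relative]
```

For two models with equal AIC the weights are both 1/2; a difference of 2 in AIC already gives the better model a weight of $1/(1 + e^{-1}) \approx 0.73$.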



Our results are shown in fig. 21 for all snapshots. For four snapshots the two models compete (having weights 
near the central value of 0.5), so that they should both be retained and some more refined form of multimodel inference 
would be needed (the so-called multimodel average [41]). However, for most of the remaining snapshots the DCM 
outperforms the RCM, which is therefore less effective in explaining the observed topology. Given the extreme values 
of the Akaike weights, AIC seems to classify the RCM as an overfitting model most of the time, pointing out that the 
correct amount of information to use is encoded in the degree sequences only and that all the remaining higher-order 
structures (dyadic and triadic structure) should be considered as non-trivial patterns revealing the self-organization 
of the DIN. 

So, even if the topological information introduced by adding the local reciprocity structure to the constraints would 
seem not to be excessive, most of the time AIC classifies it as redundant and suggests that the optimal level of 
description is the one achieved by controlling for the in- and out-degree sequences, confirming that all higher-order 
patterns starting from the dyadic ones should be regarded as significant. Clearly, in order to filter out the dyadic effect 
from the triadic abundances, and select only the triads which cannot be explained in terms of a combination of dyads, 
the RCM remains the model to use. 

FIG. 22. Entropy for the three null models DRG (purple), DCM (blue) and RCM (green). 

H. Entropy 

In order to measure how effective the chosen constraints are in 'narrowing' the ensemble of networks around the 
observed configuration, we calculated the Shannon entropy of the probability distribution induced by each null model 
[32, 33]. To obtain comparable results across the three null models, we normalized all entropies between 0 and 1, 
considering that the DRG and the DCM predict probability coefficients for the directed pairs while the RCM predicts 
probability coefficients for the dyads. This normalization results in the following definition: 

S^,, ^ - EAeoPNM{A)log,PNM{A) ^ ^^^^^ ^^^^^ ^^^^ 

The result is shown in fig. 22. The same observations made about the connectance are also valid for the DRG 
entropy: the magnitude of the change between $t_{40}$ and $t_{41}$ is of the same (or even lower) order as the previous ones, 
so that it would be difficult to detect the crisis in terms of an anomalous behavior of $S_{DRG}$. On the other hand, 
both the DCM and the RCM entropies show a clear (even if not dramatically large) jump between 2007 and 2008, 
somehow indicating the onset of the crisis. However, there is no clear indication, in either the DRG prediction or 
the DCM prediction, of a pre-crisis period. This seems to indicate that the clues of the upcoming instability are 
detectable neither in the degree sequences, nor in the local reciprocity structure, but in higher-order statistics (triadic 
motifs and core-periphery division), once the dyadic structure is kept fixed (degree sequences). Also note the small 
difference between $S_{DCM}$ and $S_{RCM}$. While in going from the DRG to the DCM there is clearly a large information 
gain, there is only a much smaller gain in going from the DCM to the RCM. This explains why AIC most of the time 
indicates that the information gained by the RCM over the DCM is not enough to justify the introduction of the 
additional model parameters. 
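For models with independent links (the DRG and the DCM), the Shannon entropy of the ensemble reduces to a sum of binary entropies over the ordered pairs of vertices. The sketch below illustrates this case only. The normalization by the maximum value $N(N-1)\ln 2$ is an assumption of this sketch, chosen so that the result lies between 0 and 1; it is not taken from the definition given above, and the RCM (whose independent units are dyads, not links) is not covered. The function name is hypothetical.

```python
import numpy as np

def normalized_link_entropy(P):
    """Entropy of an independent-link ensemble, divided by N(N-1) ln 2 (assumed normalization)."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    p = P[~np.eye(n, dtype=bool)]      # probabilities of the N(N-1) ordered pairs
    p = p[(p > 0) & (p < 1)]           # pairs with p = 0 or p = 1 contribute zero entropy
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p)).sum()
    return entropy / (n * (n - 1) * np.log(2))
```

With this convention a maximally random ensemble ($p_{ij} = 1/2$ for all pairs) has entropy 1, and a fully deterministic one has entropy 0, so smaller values indicate constraints that narrow the ensemble more tightly.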